This question should be answered using the Weekly data set, which is part of the ISLR package. This data is similar in nature to the Smarket data from this chapter's lab, except that it contains 1,089 weekly returns for 21 years, from the beginning of 1990 to the end of 2010.
Produce some numerical and graphical summaries of the Weekly data. Do there appear to be any patterns?
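The summaries below come from a code chunk that was not echoed in the knit; a chunk consistent with this output would look roughly like the following (the column index passed to cor() assumes Direction is the 9th column, as the summary output shows):

```r
library(ISLR)

dim(Weekly)       # number of rows and columns
summary(Weekly)   # numerical summary of every variable
# Direction is a factor, so drop it before computing correlations
cor(Weekly[, -9])
pairs(Weekly)     # graphical summary: all pairwise scatterplots
```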
## [1] 1089 9
## Year Lag1 Lag2 Lag3
## Min. :1990 Min. :-18.1950 Min. :-18.1950 Min. :-18.1950
## 1st Qu.:1995 1st Qu.: -1.1540 1st Qu.: -1.1540 1st Qu.: -1.1580
## Median :2000 Median : 0.2410 Median : 0.2410 Median : 0.2410
## Mean :2000 Mean : 0.1506 Mean : 0.1511 Mean : 0.1472
## 3rd Qu.:2005 3rd Qu.: 1.4050 3rd Qu.: 1.4090 3rd Qu.: 1.4090
## Max. :2010 Max. : 12.0260 Max. : 12.0260 Max. : 12.0260
## Lag4 Lag5 Volume Today
## Min. :-18.1950 Min. :-18.1950 Min. :0.08747 Min. :-18.1950
## 1st Qu.: -1.1580 1st Qu.: -1.1660 1st Qu.:0.33202 1st Qu.: -1.1540
## Median : 0.2380 Median : 0.2340 Median :1.00268 Median : 0.2410
## Mean : 0.1458 Mean : 0.1399 Mean :1.57462 Mean : 0.1499
## 3rd Qu.: 1.4090 3rd Qu.: 1.4050 3rd Qu.:2.05373 3rd Qu.: 1.4050
## Max. : 12.0260 Max. : 12.0260 Max. :9.32821 Max. : 12.0260
## Direction
## Down:484
## Up :605
##
##
##
##
## Year Lag1 Lag2 Lag3 Lag4
## Year 1.00000000 -0.032289274 -0.03339001 -0.03000649 -0.031127923
## Lag1 -0.03228927 1.000000000 -0.07485305 0.05863568 -0.071273876
## Lag2 -0.03339001 -0.074853051 1.00000000 -0.07572091 0.058381535
## Lag3 -0.03000649 0.058635682 -0.07572091 1.00000000 -0.075395865
## Lag4 -0.03112792 -0.071273876 0.05838153 -0.07539587 1.000000000
## Lag5 -0.03051910 -0.008183096 -0.07249948 0.06065717 -0.075675027
## Volume 0.84194162 -0.064951313 -0.08551314 -0.06928771 -0.061074617
## Today -0.03245989 -0.075031842 0.05916672 -0.07124364 -0.007825873
## Lag5 Volume Today
## Year -0.030519101 0.84194162 -0.032459894
## Lag1 -0.008183096 -0.06495131 -0.075031842
## Lag2 -0.072499482 -0.08551314 0.059166717
## Lag3 0.060657175 -0.06928771 -0.071243639
## Lag4 -0.075675027 -0.06107462 -0.007825873
## Lag5 1.000000000 -0.05851741 0.011012698
## Volume -0.058517414 1.00000000 -0.033077783
## Today 0.011012698 -0.03307778 1.000000000
The data set contains 9 variables and 1,089 observations. The correlation matrix shows a strong positive correlation between Year and Volume; no other pair of variables is meaningfully correlated.
set.seed(2020)
plot(Volume ~ jitter(Year, factor = 2.5), data = Weekly,
     xlab = "Year", ylab = "Volume (billions)")
The positive linear relationship between Volume and Year is evident; from 2005 onward, however, the relationship is less clear because the data become much more dispersed.
Use the full data set to perform a logistic regression with Direction as the response and the five lag variables plus Volume as predictors. Use the summary function to print the results. Do any of the predictors appear to be statistically significant? If so, which ones?
logit_mod1 <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
data = Weekly, family = binomial(link = "logit"))
summary(logit_mod1)
##
## Call:
## glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 +
## Volume, family = binomial(link = "logit"), data = Weekly)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.6949 -1.2565 0.9913 1.0849 1.4579
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.26686 0.08593 3.106 0.0019 **
## Lag1 -0.04127 0.02641 -1.563 0.1181
## Lag2 0.05844 0.02686 2.175 0.0296 *
## Lag3 -0.01606 0.02666 -0.602 0.5469
## Lag4 -0.02779 0.02646 -1.050 0.2937
## Lag5 -0.01447 0.02638 -0.549 0.5833
## Volume -0.02274 0.03690 -0.616 0.5377
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1496.2 on 1088 degrees of freedom
## Residual deviance: 1486.4 on 1082 degrees of freedom
## AIC: 1500.4
##
## Number of Fisher Scoring iterations: 4
At the 0.05 significance level, the only statistically significant predictor is Lag2.
Compute the confusion matrix and overall fraction of correct predictions. Explain what the confusion matrix is telling you about the types of mistakes made by logistic regression.
# Predicted probabilities
predlogit_mod1 <- predict(logit_mod1, type = "response")
contrasts(Weekly$Direction)
##      Up
## Down 0
## Up 1
predlogit_mod1 <- ifelse(predlogit_mod1 > 0.5, yes = "Up", no = "Down")
# Compute the confusion matrix and the fraction of correct predictions
confusion <- function(prediction, data, cutoff=0.5){
  real <- as.character(data$Direction)
  a <- sum(real=="Down" & prediction=="Down") # real Down, predicted Down (correct)
  b <- sum(real=="Up" & prediction=="Down") # real Up, predicted Down (incorrect)
  c <- sum(real=="Down" & prediction=="Up") # real Down, predicted Up (incorrect)
  d <- sum(real=="Up" & prediction=="Up") # real Up, predicted Up (correct)
CM <- data.frame("1" = c("Prediction","Down","Up"),
"2" = c("Down", a, c),
"3" = c("Up", b, d))
names(CM) <- c("", "Real", "")
ACC <- (a + d) / (a + b + c + d) # = 1 - err
ERR <- (b + c) / (a + b + c + d) # = 1 - acc
return(list(confusion_matrix = CM,
accuracy = ACC,
error = ERR))
}
confusion(predlogit_mod1, Weekly)
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 54 48
## 3 Up 430 557
##
## $accuracy
## [1] 0.5610652
##
## $error
## [1] 0.4389348
- 54 Down weeks were correctly predicted as Down.
- 557 Up weeks were correctly predicted as Up.
- 430 Down weeks were incorrectly predicted as Up.
- 48 Up weeks were incorrectly predicted as Down.
The overall accuracy is 0.561, i.e. the model predicts the market direction correctly 56% of the time, and the (training) error rate is 0.439. The model predicts Up for the vast majority of weeks, so almost all of its mistakes are Down weeks misclassified as Up.
Now fit the logistic regression model using a training data period from 1990 to 2008, with Lag2 as the only predictor. Compute the confusion matrix and the overall fraction of correct predictions for the held out data (that is, the data from 2009 and 2010).
test <- c(which(Weekly$Year == 2009), which(Weekly$Year == 2010))
train <- Weekly[-test, ]
test <- Weekly[test, ]
# Lag2 as the only predictor, with a logit link
logit_mod <- glm(Direction ~ Lag2,
data = train, family = binomial(link = "logit"))
# Model summary
summary(logit_mod)
##
## Call:
## glm(formula = Direction ~ Lag2, family = binomial(link = "logit"),
## data = train)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.536 -1.264 1.021 1.091 1.368
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.20326 0.06428 3.162 0.00157 **
## Lag2 0.05810 0.02870 2.024 0.04298 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1354.7 on 984 degrees of freedom
## Residual deviance: 1350.5 on 983 degrees of freedom
## AIC: 1354.5
##
## Number of Fisher Scoring iterations: 4
# Predicted probabilities
predlogit_mod <- predict(logit_mod, newdata = test, type = "response")
predlogit_mod <- ifelse(predlogit_mod > 0.5, yes = "Up", no = "Down")
confusion(predlogit_mod, test) # correct on 62.5% of the held-out weeks
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 9 5
## 3 Up 34 56
##
## $accuracy
## [1] 0.625
##
## $error
## [1] 0.375
- 9 Down weeks were correctly predicted as Down.
- 56 Up weeks were correctly predicted as Up.
- 34 Down weeks were incorrectly predicted as Up.
- 5 Up weeks were incorrectly predicted as Down.
The model still leans heavily toward Up: most of its errors are Down weeks classified as Up. The test accuracy is 0.625 (62.5% of the held-out weeks are predicted correctly) and the test error rate is 0.375.
Repeat (d) using LDA
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
lda_mod <- lda(Direction ~ Lag2,
data = train)
predlda_mod <- as.character(predict(lda_mod, test)$class)
confusion(predlda_mod, test) # correct on 62.5% of the held-out weeks
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 9 5
## 3 Up 34 56
##
## $accuracy
## [1] 0.625
##
## $error
## [1] 0.375
LDA gives exactly the same results as the logistic regression in the previous part.
Repeat (d) using QDA.
qda_mod <- qda(Direction ~ Lag2,
data = train)
predqda_mod <- as.character(predict(qda_mod, test)$class)
confusion(predqda_mod, test) # correct on 58.7% of the held-out weeks
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 0 0
## 3 Up 43 61
##
## $accuracy
## [1] 0.5865385
##
## $error
## [1] 0.4134615
QDA predicts Up for every week in the test set:
- 0 Down weeks were predicted as Down.
- 61 Up weeks were correctly predicted as Up.
- 43 Down weeks were incorrectly predicted as Up.
- 0 Up weeks were predicted as Down.
The accuracy is 0.5865 and the test error rate is 0.4135, which simply reflects the Up/Down proportions of the test period, since the classifier never predicts Down.
Repeat (d) using KNN with K =1.
library(class)
predknn_mod <- as.character(knn(train[, 3, drop = FALSE], test[, 3, drop = FALSE], cl = train$Direction, k = 1))
confusion(predknn_mod, test) # correct on about 51% of the held-out weeks
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 21 29
## 3 Up 22 32
##
## $accuracy
## [1] 0.5096154
##
## $error
## [1] 0.4903846
- 21 Down weeks were correctly predicted as Down.
- 32 Up weeks were correctly predicted as Up.
- 22 Down weeks were incorrectly predicted as Up.
- 29 Up weeks were incorrectly predicted as Down.
The accuracy is 0.5096 and the test error rate is 0.4904, barely better than random guessing.
Which of these methods appears to provide the best results on this data?
Logistic regression and LDA appear to provide the best results on this data, each with 62.5% test accuracy.
Experiment with different combinations of predictors, including possible transformations and interactions, for each of the methods. Report the variables, method, and associated confusion matrix that appears to provide the best results on the held out data. Note that you should also experiment with values for K in the KNN classifier.
# Trying logistic regression (intercept only)
logit_mod2 <- glm(Direction ~ 1,
data = train, family = binomial(link = "logit"))
# Model summary
summary(logit_mod2)
##
## Call:
## glm(formula = Direction ~ 1, family = binomial(link = "logit"),
## data = train)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.268 -1.268 1.090 1.090 1.090
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.20990 0.06408 3.276 0.00105 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1354.7 on 984 degrees of freedom
## Residual deviance: 1354.7 on 984 degrees of freedom
## AIC: 1356.7
##
## Number of Fisher Scoring iterations: 3
# Predicted probabilities
predlogit_mod2 <- predict(logit_mod2, newdata = test, type = "response")
predlogit_mod2 <- ifelse(predlogit_mod2 > 0.5, yes = "Up", no = "Down")
confusion(predlogit_mod2, test) # correct on 58.7% of the held-out weeks
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 0 0
## 3 Up 43 61
##
## $accuracy
## [1] 0.5865385
##
## $error
## [1] 0.4134615
An intercept-only logit model, which always predicts Up, produces the same results as QDA with Lag2 as the only regressor.
Trying LDA and QDA: adding more variables to LDA and QDA overfits the training data, and the test accuracy drops.
# Trying KNN with k = 3
predknn_mod <- as.character(knn(train[, 3, drop = FALSE], test[, 3, drop = FALSE], cl = train$Direction, k = 3))
confusion(predknn_mod, test) # correct on 54.8% of the held-out weeks
## $confusion_matrix
## Real
## 1 Prediction Down Up
## 2 Down 16 20
## 3 Up 27 41
##
## $accuracy
## [1] 0.5480769
##
## $error
## [1] 0.4519231
Moving from k = 1 to k = 3 raises the test accuracy from 51.0% to 54.8%; beyond k = 3 the accuracy stops improving, so k = 3 is the most suitable value among those tried.
In this problem, you will develop a model to predict whether a given car gets high or low gas mileage based on the Auto data set.
Create a binary variable, mpg01, that contains a 1 if mpg contains a value above its median, and a 0 if mpg contains a value below its median. You can compute the median using the median() function. Note you may find it helpful to use the data.frame() function to create a single data set containing both mpg01 and the other Auto variables.
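The structure print-out below comes from an unechoed chunk; presumably something along the lines of:

```r
library(ISLR)
str(Auto)   # 392 obs. of 9 variables
```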
## 'data.frame': 392 obs. of 9 variables:
## $ mpg : num 18 15 18 16 17 15 14 14 14 15 ...
## $ cylinders : num 8 8 8 8 8 8 8 8 8 8 ...
## $ displacement: num 307 350 318 304 302 429 454 440 455 390 ...
## $ horsepower : num 130 165 150 150 140 198 220 215 225 190 ...
## $ weight : num 3504 3693 3436 3433 3449 ...
## $ acceleration: num 12 11.5 11 12 10.5 10 9 8.5 10 8.5 ...
## $ year : num 70 70 70 70 70 70 70 70 70 70 ...
## $ origin : num 1 1 1 1 1 1 1 1 1 1 ...
## $ name : Factor w/ 304 levels "amc ambassador brougham",..: 49 36 231 14 161 141 54 223 241 2 ...
The data set has 9 variables and 392 observations; all variables are numeric except 'name', which is a factor.
mediana <- median(Auto$mpg)
mpg01 <- ifelse(Auto$mpg > mediana, yes = 1, no = 0)
newAuto <- cbind(mpg01, Auto[, -1])
summary(newAuto)
## mpg01 cylinders displacement horsepower weight
## Min. :0.0 Min. :3.000 Min. : 68.0 Min. : 46.0 Min. :1613
## 1st Qu.:0.0 1st Qu.:4.000 1st Qu.:105.0 1st Qu.: 75.0 1st Qu.:2225
## Median :0.5 Median :4.000 Median :151.0 Median : 93.5 Median :2804
## Mean :0.5 Mean :5.472 Mean :194.4 Mean :104.5 Mean :2978
## 3rd Qu.:1.0 3rd Qu.:8.000 3rd Qu.:275.8 3rd Qu.:126.0 3rd Qu.:3615
## Max. :1.0 Max. :8.000 Max. :455.0 Max. :230.0 Max. :5140
##
## acceleration year origin name
## Min. : 8.00 Min. :70.00 Min. :1.000 amc matador : 5
## 1st Qu.:13.78 1st Qu.:73.00 1st Qu.:1.000 ford pinto : 5
## Median :15.50 Median :76.00 Median :1.000 toyota corolla : 5
## Mean :15.54 Mean :75.98 Mean :1.577 amc gremlin : 4
## 3rd Qu.:17.02 3rd Qu.:79.00 3rd Qu.:2.000 amc hornet : 4
## Max. :24.80 Max. :82.00 Max. :3.000 chevrolet chevette: 4
## (Other) :365
The summary above shows the minimum, maximum, mean, and median of each variable.
Explore the data graphically in order to investigate the association between mpg01 and the other features. Which of the other features seem most likely to be useful in predicting mpg01? Scatterplots and boxplots may be useful tools to answer this question. Describe your findings.
# To avoid violating normality, the covariates are left untransformed; only the response is converted to a factor
newAuto$mpg01 <- as.factor(as.character(newAuto$mpg01))
plot(cylinders ~ mpg01, data = newAuto)
With a factor on the x-axis, plot() draws a boxplot of the covariate for each response class. The group means appear clearly different, so it would be worthwhile to include this variable (and the others with similarly separated boxplots) as predictors.
Split the data into a training set and a test set.
Training and test sets:
set.seed(2020)
idx <- sample(1:nrow(newAuto), size = nrow(newAuto)*0.7, replace = FALSE)
# Training set (70% of the observations)
train <- newAuto[idx, ]
# Test set
test <- newAuto[-idx, ]
Computation of the confusion matrix and the fraction of correct predictions:
confusion <- function(prediction, data){
  real <- data$mpg01
  a <- sum(real=="0" & prediction=="0") # real 0, predicted 0 (correct)
  b <- sum(real=="1" & prediction=="0") # real 1, predicted 0 (incorrect)
  c <- sum(real=="0" & prediction=="1") # real 0, predicted 1 (incorrect)
  d <- sum(real=="1" & prediction=="1") # real 1, predicted 1 (correct)
CM <- data.frame("1" = c("Prediction","0","1"),
"2" = c("0", a, c),
"3" = c("1", b, d))
names(CM) <- c("", "Real", "")
ACC <- (a + d) / (a + b + c + d) # = 1 - err
ERR <- (b + c) / (a + b + c + d) # = 1 - acc
return(list(confusion_matrix = CM,
accuracy = ACC,
error = ERR))
}
Perform LDA on the training data in order to predict mpg01 using the variables that seemed most associated with mpg01 in (b). What is the test error of the model obtained?
library(MASS)
lda_mod <- lda(mpg01 ~ cylinders + displacement + horsepower + weight + year,
data = train)
predlda_mod <- as.character(predict(lda_mod, test)$class)
confusion(predlda_mod, test)
## $confusion_matrix
## Real
## 1 Prediction 0 1
## 2 0 48 3
## 3 1 8 59
##
## $accuracy
## [1] 0.9067797
##
## $error
## [1] 0.09322034
The test error is 9.3%.
Perform QDA on the training data in order to predict mpg01 using the variables that seemed most associated with mpg01 in (b). What is the test error of the model obtained?
qda_mod <- qda(mpg01 ~ cylinders + displacement + horsepower + weight + year,
data = train)
predqda_mod <- as.character(predict(qda_mod, test)$class)
confusion(predqda_mod, test)
## $confusion_matrix
## Real
## 1 Prediction 0 1
## 2 0 49 12
## 3 1 7 50
##
## $accuracy
## [1] 0.8389831
##
## $error
## [1] 0.1610169
The test error is 16.1%.
Perform logistic regression on the training data in order to predict mpg01 using the variables that seemed most associated with mpg01 in (b). What is the test error of the model obtained?
logit_mod <- glm(mpg01 ~ cylinders + displacement + horsepower + weight + year,
data = train, family = binomial(link = "logit"))
# Model summary
summary(logit_mod)
##
## Call:
## glm(formula = mpg01 ~ cylinders + displacement + horsepower +
## weight + year, family = binomial(link = "logit"), data = train)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3105 -0.1464 -0.0013 0.2369 3.2283
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -12.513987 6.350907 -1.970 0.04879 *
## cylinders -0.592639 0.558038 -1.062 0.28823
## displacement 0.006059 0.012949 0.468 0.63985
## horsepower -0.022678 0.019656 -1.154 0.24861
## weight -0.004536 0.001191 -3.807 0.00014 ***
## year 0.385479 0.092263 4.178 2.94e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 379.71 on 273 degrees of freedom
## Residual deviance: 104.34 on 268 degrees of freedom
## AIC: 116.34
##
## Number of Fisher Scoring iterations: 7
# Predicted probabilities
predlogit_mod <- predict(logit_mod, newdata = test, type = "response")
predlogit_mod <- ifelse(predlogit_mod > 0.5, yes = "1", no = "0")
confusion(predlogit_mod, test)
## $confusion_matrix
## Real
## 1 Prediction 0 1
## 2 0 47 6
## 3 1 9 56
##
## $accuracy
## [1] 0.8728814
##
## $error
## [1] 0.1271186
The test error is 12.7%.
Perform KNN on the training data, with several values of K ,in order to predict mpg01. Use only the variables that seemed most associated with mpg01 in (b). What test errors do you obtain? Which value of K seems to perform the best on this data set?
library(class)
predknn_mod <- as.character(knn(train[, -c(1, 6, 8, 9), drop = FALSE], test[, -c(1, 6, 8, 9), drop = FALSE], cl = train$mpg01, k = 3))
confusion(predknn_mod, test)
## $confusion_matrix
## Real
## 1 Prediction 0 1
## 2 0 46 12
## 3 1 10 50
##
## $accuracy
## [1] 0.8135593
##
## $error
## [1] 0.1864407
With K = 3, the test error is 18.6%.
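The exercise asks for several values of K, but only k = 3 is shown above. A sketch for comparing a range of K values, reusing the train/test split and the same predictor columns as the chunk above (the set of k values tried is an arbitrary choice):

```r
library(class)

set.seed(2020)
for (k in c(1, 3, 5, 10, 15)) {
  pred <- knn(train[, -c(1, 6, 8, 9)], test[, -c(1, 6, 8, 9)],
              cl = train$mpg01, k = k)
  err <- mean(pred != test$mpg01)   # test error rate for this k
  cat("k =", k, " test error =", round(err, 4), "\n")
}
```

The value of k with the smallest printed test error is the one that performs best on this split.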
This problem involves writing functions.
Write a function, Power(), that prints out the result of raising 2 to the 3rd power. In other words, your function should compute 2^3 and print out the results. Hint: Recall that x^a raises x to the power a. Use the print() function to output the result.
## [1] 8
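The code chunk itself was not echoed in the knit; a definition consistent with the printed result is:

```r
Power <- function() {
  print(2^3)   # prints 8
}
Power()
```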
Create a new function, Power2(), that allows you to pass any two numbers, x and a, and prints out the value of x^a. You can do this by beginning your function with the line Power2 = function(x, a){. You should be able to call your function by entering, for instance, Power2(3, 8) on the command line. This should output the value of 3^8, namely, 6,561.
## [1] 6561
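Again the chunk was not echoed; a definition matching the output is:

```r
Power2 <- function(x, a) {
  print(x^a)
}
Power2(3, 8)   # prints 6561
```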
Using the Power2() function that you just wrote, compute 10^3, 8^17, and 131^3.
## [1] 1000
## [1] 2.2518e+15
## [1] 2248091
Now create a new function, Power3(), that actually returns the result x^a as an R object, rather than simply printing it to the screen. That is, if you store the value x^a in an object called result within your function, then you can simply return() this result, using the following line: return(result). The line above should be the last line in your function, before the } symbol.
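A Power3() definition matching this description (the chunk was not echoed; this is the version the later plotting code appears to rely on):

```r
Power3 <- function(x, a) {
  result <- x^a
  return(result)   # returned as an R object instead of printed
}
```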
Now using the Power3() function, create a plot of f(x) = x^2. The x-axis should display a range of integers from 1 to 10, and the y-axis should display x^2. Label the axes appropriately, and use an appropriate title for the figure. Consider displaying either the x-axis, the y-axis, or both on the log-scale. You can do this by using log="x", log="y", or log="xy" as arguments to the plot() function.
x <- 1:10
y <- Power3(x, 2)
plot(x, y, pch = 19, las = 1, xaxt = "n",
main = bquote(f(x) == x^2 ~ with ~ 1 <= ~ x <= 10))
axis(1, at = 1:10, labels = 1:10)
Create a function, PlotPower(), that allows you to create a plot of x against x^a for a fixed a and for a range of values of x. For instance, if you call PlotPower(1:10, 3) then a plot should be created with an x-axis taking on values 1, 2, …, 10, and a y-axis taking on values 1^3, 2^3, …, 10^3.
PlotPower <- function(x, a){
y <- x^a
plot(x, y, pch = 19, las = 1, xaxt = "n", yaxt = "n",
main = bquote(f(x) == x^ ~ .(a) ~ with ~ .(x[1]) <= ~ x <= .(x[length(x)])))
axis(1, at = x, labels = x)
axis(2, at = x^a, labels = x^a, las = 1)
}
PlotPower(1:10, 3)
In the lab, we applied random forests to the Boston data using mtry=6 and using ntree=25 and ntree=500. Create a plot displaying the test error resulting from random forests on this data set for a more comprehensive range of values for mtry and ntree. You can model your plot after Figure 8.10. Describe the results obtained.
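No code or figure for this exercise survives in the knit. A sketch of the kind of experiment the exercise asks for, assuming the randomForest package and a simple 50/50 hold-out split (the mtry values, seed, and y-axis limits are arbitrary choices):

```r
library(MASS)            # Boston data
library(randomForest)

set.seed(1)
idx   <- sample(1:nrow(Boston), nrow(Boston) / 2)
train <- Boston[idx, ]
test  <- Boston[-idx, ]

# One curve per mtry value: test MSE after each tree is added, up to ntree = 500
plot(NULL, xlim = c(1, 500), ylim = c(10, 30),
     xlab = "Number of Trees", ylab = "Test MSE")
cols <- c("red", "blue", "darkgreen")
mtry_vals <- c(4, 6, 13)   # roughly sqrt(p), p/2, and all p = 13 predictors
for (i in seq_along(mtry_vals)) {
  rf <- randomForest(medv ~ ., data = train, xtest = test[, -14],
                     ytest = test$medv, mtry = mtry_vals[i], ntree = 500)
  lines(1:500, rf$test$mse, col = cols[i])   # test MSE trajectory
}
legend("topright", legend = paste("mtry =", mtry_vals), col = cols, lty = 1)
```

Passing xtest/ytest to randomForest() makes it record the test MSE after each tree, which is what lets a single fit draw a whole curve.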
## crim zn indus chas nox rm age dis rad tax ptratio black
## 1 0.00632 18.0 2.31 0 0.5380 6.575 65.2 4.0900 1 296 15.3 396.90
## 2 0.02731 0.0 7.07 0 0.4690 6.421 78.9 4.9671 2 242 17.8 396.90
## 3 0.02729 0.0 7.07 0 0.4690 7.185 61.1 4.9671 2 242 17.8 392.83
## 4 0.03237 0.0 2.18 0 0.4580 6.998 45.8 6.0622 3 222 18.7 394.63
## 5 0.06905 0.0 2.18 0 0.4580 7.147 54.2 6.0622 3 222 18.7 396.90
## 6 0.02985 0.0 2.18 0 0.4580 6.430 58.7 6.0622 3 222 18.7 394.12
## ... (rows 7 onward of the Boston data print-out omitted)
## 155 1.41385 0.0 19.58 1 0.8710 6.129 96.0 1.7494 5 403 14.7 321.02
## 156 3.53501 0.0 19.58 1 0.8710 6.152 82.6 1.7455 5 403 14.7 88.01
## 157 2.44668 0.0 19.58 0 0.8710 5.272 94.0 1.7364 5 403 14.7 88.63
## 158 1.22358 0.0 19.58 0 0.6050 6.943 97.4 1.8773 5 403 14.7 363.43
## 159 1.34284 0.0 19.58 0 0.6050 6.066 100.0 1.7573 5 403 14.7 353.89
## 160 1.42502 0.0 19.58 0 0.8710 6.510 100.0 1.7659 5 403 14.7 364.31
## 161 1.27346 0.0 19.58 1 0.6050 6.250 92.6 1.7984 5 403 14.7 338.92
## 162 1.46336 0.0 19.58 0 0.6050 7.489 90.8 1.9709 5 403 14.7 374.43
## 163 1.83377 0.0 19.58 1 0.6050 7.802 98.2 2.0407 5 403 14.7 389.61
## 164 1.51902 0.0 19.58 1 0.6050 8.375 93.9 2.1620 5 403 14.7 388.45
## 165 2.24236 0.0 19.58 0 0.6050 5.854 91.8 2.4220 5 403 14.7 395.11
## 166 2.92400 0.0 19.58 0 0.6050 6.101 93.0 2.2834 5 403 14.7 240.16
## 167 2.01019 0.0 19.58 0 0.6050 7.929 96.2 2.0459 5 403 14.7 369.30
## 168 1.80028 0.0 19.58 0 0.6050 5.877 79.2 2.4259 5 403 14.7 227.61
## 169 2.30040 0.0 19.58 0 0.6050 6.319 96.1 2.1000 5 403 14.7 297.09
## 170 2.44953 0.0 19.58 0 0.6050 6.402 95.2 2.2625 5 403 14.7 330.04
## 171 1.20742 0.0 19.58 0 0.6050 5.875 94.6 2.4259 5 403 14.7 292.29
## 172 2.31390 0.0 19.58 0 0.6050 5.880 97.3 2.3887 5 403 14.7 348.13
## 173 0.13914 0.0 4.05 0 0.5100 5.572 88.5 2.5961 5 296 16.6 396.90
## 174 0.09178 0.0 4.05 0 0.5100 6.416 84.1 2.6463 5 296 16.6 395.50
## 175 0.08447 0.0 4.05 0 0.5100 5.859 68.7 2.7019 5 296 16.6 393.23
## 176 0.06664 0.0 4.05 0 0.5100 6.546 33.1 3.1323 5 296 16.6 390.96
## 177 0.07022 0.0 4.05 0 0.5100 6.020 47.2 3.5549 5 296 16.6 393.23
## 178 0.05425 0.0 4.05 0 0.5100 6.315 73.4 3.3175 5 296 16.6 395.60
## 179 0.06642 0.0 4.05 0 0.5100 6.860 74.4 2.9153 5 296 16.6 391.27
## 180 0.05780 0.0 2.46 0 0.4880 6.980 58.4 2.8290 3 193 17.8 396.90
## 181 0.06588 0.0 2.46 0 0.4880 7.765 83.3 2.7410 3 193 17.8 395.56
## 182 0.06888 0.0 2.46 0 0.4880 6.144 62.2 2.5979 3 193 17.8 396.90
## 183 0.09103 0.0 2.46 0 0.4880 7.155 92.2 2.7006 3 193 17.8 394.12
## 184 0.10008 0.0 2.46 0 0.4880 6.563 95.6 2.8470 3 193 17.8 396.90
## 185 0.08308 0.0 2.46 0 0.4880 5.604 89.8 2.9879 3 193 17.8 391.00
## 186 0.06047 0.0 2.46 0 0.4880 6.153 68.8 3.2797 3 193 17.8 387.11
## 187 0.05602 0.0 2.46 0 0.4880 7.831 53.6 3.1992 3 193 17.8 392.63
## 188 0.07875 45.0 3.44 0 0.4370 6.782 41.1 3.7886 5 398 15.2 393.87
## 189 0.12579 45.0 3.44 0 0.4370 6.556 29.1 4.5667 5 398 15.2 382.84
## 190 0.08370 45.0 3.44 0 0.4370 7.185 38.9 4.5667 5 398 15.2 396.90
## 191 0.09068 45.0 3.44 0 0.4370 6.951 21.5 6.4798 5 398 15.2 377.68
## 192 0.06911 45.0 3.44 0 0.4370 6.739 30.8 6.4798 5 398 15.2 389.71
## 193 0.08664 45.0 3.44 0 0.4370 7.178 26.3 6.4798 5 398 15.2 390.49
## 194 0.02187 60.0 2.93 0 0.4010 6.800 9.9 6.2196 1 265 15.6 393.37
## 195 0.01439 60.0 2.93 0 0.4010 6.604 18.8 6.2196 1 265 15.6 376.70
## 196 0.01381 80.0 0.46 0 0.4220 7.875 32.0 5.6484 4 255 14.4 394.23
## 197 0.04011 80.0 1.52 0 0.4040 7.287 34.1 7.3090 2 329 12.6 396.90
## 198 0.04666 80.0 1.52 0 0.4040 7.107 36.6 7.3090 2 329 12.6 354.31
## 199 0.03768 80.0 1.52 0 0.4040 7.274 38.3 7.3090 2 329 12.6 392.20
## 200 0.03150 95.0 1.47 0 0.4030 6.975 15.3 7.6534 3 402 17.0 396.90
## 201 0.01778 95.0 1.47 0 0.4030 7.135 13.9 7.6534 3 402 17.0 384.30
## 202 0.03445 82.5 2.03 0 0.4150 6.162 38.4 6.2700 2 348 14.7 393.77
## 203 0.02177 82.5 2.03 0 0.4150 7.610 15.7 6.2700 2 348 14.7 395.38
## 204 0.03510 95.0 2.68 0 0.4161 7.853 33.2 5.1180 4 224 14.7 392.78
## 205 0.02009 95.0 2.68 0 0.4161 8.034 31.9 5.1180 4 224 14.7 390.55
## 206 0.13642 0.0 10.59 0 0.4890 5.891 22.3 3.9454 4 277 18.6 396.90
## 207 0.22969 0.0 10.59 0 0.4890 6.326 52.5 4.3549 4 277 18.6 394.87
## 208 0.25199 0.0 10.59 0 0.4890 5.783 72.7 4.3549 4 277 18.6 389.43
## 209 0.13587 0.0 10.59 1 0.4890 6.064 59.1 4.2392 4 277 18.6 381.32
## 210 0.43571 0.0 10.59 1 0.4890 5.344 100.0 3.8750 4 277 18.6 396.90
## 211 0.17446 0.0 10.59 1 0.4890 5.960 92.1 3.8771 4 277 18.6 393.25
## 212 0.37578 0.0 10.59 1 0.4890 5.404 88.6 3.6650 4 277 18.6 395.24
## 213 0.21719 0.0 10.59 1 0.4890 5.807 53.8 3.6526 4 277 18.6 390.94
## 214 0.14052 0.0 10.59 0 0.4890 6.375 32.3 3.9454 4 277 18.6 385.81
## 215 0.28955 0.0 10.59 0 0.4890 5.412 9.8 3.5875 4 277 18.6 348.93
## 216 0.19802 0.0 10.59 0 0.4890 6.182 42.4 3.9454 4 277 18.6 393.63
## 217 0.04560 0.0 13.89 1 0.5500 5.888 56.0 3.1121 5 276 16.4 392.80
## 218 0.07013 0.0 13.89 0 0.5500 6.642 85.1 3.4211 5 276 16.4 392.78
## 219 0.11069 0.0 13.89 1 0.5500 5.951 93.8 2.8893 5 276 16.4 396.90
## 220 0.11425 0.0 13.89 1 0.5500 6.373 92.4 3.3633 5 276 16.4 393.74
## 221 0.35809 0.0 6.20 1 0.5070 6.951 88.5 2.8617 8 307 17.4 391.70
## 222 0.40771 0.0 6.20 1 0.5070 6.164 91.3 3.0480 8 307 17.4 395.24
## 223 0.62356 0.0 6.20 1 0.5070 6.879 77.7 3.2721 8 307 17.4 390.39
## 224 0.61470 0.0 6.20 0 0.5070 6.618 80.8 3.2721 8 307 17.4 396.90
## 225 0.31533 0.0 6.20 0 0.5040 8.266 78.3 2.8944 8 307 17.4 385.05
## 226 0.52693 0.0 6.20 0 0.5040 8.725 83.0 2.8944 8 307 17.4 382.00
## 227 0.38214 0.0 6.20 0 0.5040 8.040 86.5 3.2157 8 307 17.4 387.38
## 228 0.41238 0.0 6.20 0 0.5040 7.163 79.9 3.2157 8 307 17.4 372.08
## 229 0.29819 0.0 6.20 0 0.5040 7.686 17.0 3.3751 8 307 17.4 377.51
## 230 0.44178 0.0 6.20 0 0.5040 6.552 21.4 3.3751 8 307 17.4 380.34
## 231 0.53700 0.0 6.20 0 0.5040 5.981 68.1 3.6715 8 307 17.4 378.35
## 232 0.46296 0.0 6.20 0 0.5040 7.412 76.9 3.6715 8 307 17.4 376.14
## 233 0.57529 0.0 6.20 0 0.5070 8.337 73.3 3.8384 8 307 17.4 385.91
## 234 0.33147 0.0 6.20 0 0.5070 8.247 70.4 3.6519 8 307 17.4 378.95
## 235 0.44791 0.0 6.20 1 0.5070 6.726 66.5 3.6519 8 307 17.4 360.20
## 236 0.33045 0.0 6.20 0 0.5070 6.086 61.5 3.6519 8 307 17.4 376.75
## 237 0.52058 0.0 6.20 1 0.5070 6.631 76.5 4.1480 8 307 17.4 388.45
## 238 0.51183 0.0 6.20 0 0.5070 7.358 71.6 4.1480 8 307 17.4 390.07
## 239 0.08244 30.0 4.93 0 0.4280 6.481 18.5 6.1899 6 300 16.6 379.41
## 240 0.09252 30.0 4.93 0 0.4280 6.606 42.2 6.1899 6 300 16.6 383.78
## 241 0.11329 30.0 4.93 0 0.4280 6.897 54.3 6.3361 6 300 16.6 391.25
## 242 0.10612 30.0 4.93 0 0.4280 6.095 65.1 6.3361 6 300 16.6 394.62
## 243 0.10290 30.0 4.93 0 0.4280 6.358 52.9 7.0355 6 300 16.6 372.75
## 244 0.12757 30.0 4.93 0 0.4280 6.393 7.8 7.0355 6 300 16.6 374.71
## 245 0.20608 22.0 5.86 0 0.4310 5.593 76.5 7.9549 7 330 19.1 372.49
## 246 0.19133 22.0 5.86 0 0.4310 5.605 70.2 7.9549 7 330 19.1 389.13
## 247 0.33983 22.0 5.86 0 0.4310 6.108 34.9 8.0555 7 330 19.1 390.18
## 248 0.19657 22.0 5.86 0 0.4310 6.226 79.2 8.0555 7 330 19.1 376.14
## 249 0.16439 22.0 5.86 0 0.4310 6.433 49.1 7.8265 7 330 19.1 374.71
## 250 0.19073 22.0 5.86 0 0.4310 6.718 17.5 7.8265 7 330 19.1 393.74
## 251 0.14030 22.0 5.86 0 0.4310 6.487 13.0 7.3967 7 330 19.1 396.28
## 252 0.21409 22.0 5.86 0 0.4310 6.438 8.9 7.3967 7 330 19.1 377.07
## 253 0.08221 22.0 5.86 0 0.4310 6.957 6.8 8.9067 7 330 19.1 386.09
## 254 0.36894 22.0 5.86 0 0.4310 8.259 8.4 8.9067 7 330 19.1 396.90
## 255 0.04819 80.0 3.64 0 0.3920 6.108 32.0 9.2203 1 315 16.4 392.89
## 256 0.03548 80.0 3.64 0 0.3920 5.876 19.1 9.2203 1 315 16.4 395.18
## 257 0.01538 90.0 3.75 0 0.3940 7.454 34.2 6.3361 3 244 15.9 386.34
## 258 0.61154 20.0 3.97 0 0.6470 8.704 86.9 1.8010 5 264 13.0 389.70
## 259 0.66351 20.0 3.97 0 0.6470 7.333 100.0 1.8946 5 264 13.0 383.29
## 260 0.65665 20.0 3.97 0 0.6470 6.842 100.0 2.0107 5 264 13.0 391.93
## 261 0.54011 20.0 3.97 0 0.6470 7.203 81.8 2.1121 5 264 13.0 392.80
## 262 0.53412 20.0 3.97 0 0.6470 7.520 89.4 2.1398 5 264 13.0 388.37
## 263 0.52014 20.0 3.97 0 0.6470 8.398 91.5 2.2885 5 264 13.0 386.86
## 264 0.82526 20.0 3.97 0 0.6470 7.327 94.5 2.0788 5 264 13.0 393.42
## 265 0.55007 20.0 3.97 0 0.6470 7.206 91.6 1.9301 5 264 13.0 387.89
## 266 0.76162 20.0 3.97 0 0.6470 5.560 62.8 1.9865 5 264 13.0 392.40
## 267 0.78570 20.0 3.97 0 0.6470 7.014 84.6 2.1329 5 264 13.0 384.07
## 268 0.57834 20.0 3.97 0 0.5750 8.297 67.0 2.4216 5 264 13.0 384.54
## 269 0.54050 20.0 3.97 0 0.5750 7.470 52.6 2.8720 5 264 13.0 390.30
## 270 0.09065 20.0 6.96 1 0.4640 5.920 61.5 3.9175 3 223 18.6 391.34
## 271 0.29916 20.0 6.96 0 0.4640 5.856 42.1 4.4290 3 223 18.6 388.65
## 272 0.16211 20.0 6.96 0 0.4640 6.240 16.3 4.4290 3 223 18.6 396.90
## 273 0.11460 20.0 6.96 0 0.4640 6.538 58.7 3.9175 3 223 18.6 394.96
## 274 0.22188 20.0 6.96 1 0.4640 7.691 51.8 4.3665 3 223 18.6 390.77
## 275 0.05644 40.0 6.41 1 0.4470 6.758 32.9 4.0776 4 254 17.6 396.90
## 276 0.09604 40.0 6.41 0 0.4470 6.854 42.8 4.2673 4 254 17.6 396.90
## 277 0.10469 40.0 6.41 1 0.4470 7.267 49.0 4.7872 4 254 17.6 389.25
## 278 0.06127 40.0 6.41 1 0.4470 6.826 27.6 4.8628 4 254 17.6 393.45
## 279 0.07978 40.0 6.41 0 0.4470 6.482 32.1 4.1403 4 254 17.6 396.90
## 280 0.21038 20.0 3.33 0 0.4429 6.812 32.2 4.1007 5 216 14.9 396.90
## 281 0.03578 20.0 3.33 0 0.4429 7.820 64.5 4.6947 5 216 14.9 387.31
## 282 0.03705 20.0 3.33 0 0.4429 6.968 37.2 5.2447 5 216 14.9 392.23
## 283 0.06129 20.0 3.33 1 0.4429 7.645 49.7 5.2119 5 216 14.9 377.07
## 284 0.01501 90.0 1.21 1 0.4010 7.923 24.8 5.8850 1 198 13.6 395.52
## 285 0.00906 90.0 2.97 0 0.4000 7.088 20.8 7.3073 1 285 15.3 394.72
## 286 0.01096 55.0 2.25 0 0.3890 6.453 31.9 7.3073 1 300 15.3 394.72
## 287 0.01965 80.0 1.76 0 0.3850 6.230 31.5 9.0892 1 241 18.2 341.60
## 288 0.03871 52.5 5.32 0 0.4050 6.209 31.3 7.3172 6 293 16.6 396.90
## 289 0.04590 52.5 5.32 0 0.4050 6.315 45.6 7.3172 6 293 16.6 396.90
## 290 0.04297 52.5 5.32 0 0.4050 6.565 22.9 7.3172 6 293 16.6 371.72
## 291 0.03502 80.0 4.95 0 0.4110 6.861 27.9 5.1167 4 245 19.2 396.90
## 292 0.07886 80.0 4.95 0 0.4110 7.148 27.7 5.1167 4 245 19.2 396.90
## 293 0.03615 80.0 4.95 0 0.4110 6.630 23.4 5.1167 4 245 19.2 396.90
## 294 0.08265 0.0 13.92 0 0.4370 6.127 18.4 5.5027 4 289 16.0 396.90
## 295 0.08199 0.0 13.92 0 0.4370 6.009 42.3 5.5027 4 289 16.0 396.90
## 296 0.12932 0.0 13.92 0 0.4370 6.678 31.1 5.9604 4 289 16.0 396.90
## 297 0.05372 0.0 13.92 0 0.4370 6.549 51.0 5.9604 4 289 16.0 392.85
## 298 0.14103 0.0 13.92 0 0.4370 5.790 58.0 6.3200 4 289 16.0 396.90
## 299 0.06466 70.0 2.24 0 0.4000 6.345 20.1 7.8278 5 358 14.8 368.24
## 300 0.05561 70.0 2.24 0 0.4000 7.041 10.0 7.8278 5 358 14.8 371.58
## 301 0.04417 70.0 2.24 0 0.4000 6.871 47.4 7.8278 5 358 14.8 390.86
## 302 0.03537 34.0 6.09 0 0.4330 6.590 40.4 5.4917 7 329 16.1 395.75
## 303 0.09266 34.0 6.09 0 0.4330 6.495 18.4 5.4917 7 329 16.1 383.61
## 304 0.10000 34.0 6.09 0 0.4330 6.982 17.7 5.4917 7 329 16.1 390.43
## 305 0.05515 33.0 2.18 0 0.4720 7.236 41.1 4.0220 7 222 18.4 393.68
## 306 0.05479 33.0 2.18 0 0.4720 6.616 58.1 3.3700 7 222 18.4 393.36
## 307 0.07503 33.0 2.18 0 0.4720 7.420 71.9 3.0992 7 222 18.4 396.90
## 308 0.04932 33.0 2.18 0 0.4720 6.849 70.3 3.1827 7 222 18.4 396.90
## 309 0.49298 0.0 9.90 0 0.5440 6.635 82.5 3.3175 4 304 18.4 396.90
## 310 0.34940 0.0 9.90 0 0.5440 5.972 76.7 3.1025 4 304 18.4 396.24
## 311 2.63548 0.0 9.90 0 0.5440 4.973 37.8 2.5194 4 304 18.4 350.45
## 312 0.79041 0.0 9.90 0 0.5440 6.122 52.8 2.6403 4 304 18.4 396.90
## 313 0.26169 0.0 9.90 0 0.5440 6.023 90.4 2.8340 4 304 18.4 396.30
## 314 0.26938 0.0 9.90 0 0.5440 6.266 82.8 3.2628 4 304 18.4 393.39
## 315 0.36920 0.0 9.90 0 0.5440 6.567 87.3 3.6023 4 304 18.4 395.69
## 316 0.25356 0.0 9.90 0 0.5440 5.705 77.7 3.9450 4 304 18.4 396.42
## 317 0.31827 0.0 9.90 0 0.5440 5.914 83.2 3.9986 4 304 18.4 390.70
## 318 0.24522 0.0 9.90 0 0.5440 5.782 71.7 4.0317 4 304 18.4 396.90
## 319 0.40202 0.0 9.90 0 0.5440 6.382 67.2 3.5325 4 304 18.4 395.21
## 320 0.47547 0.0 9.90 0 0.5440 6.113 58.8 4.0019 4 304 18.4 396.23
## 321 0.16760 0.0 7.38 0 0.4930 6.426 52.3 4.5404 5 287 19.6 396.90
## 322 0.18159 0.0 7.38 0 0.4930 6.376 54.3 4.5404 5 287 19.6 396.90
## 323 0.35114 0.0 7.38 0 0.4930 6.041 49.9 4.7211 5 287 19.6 396.90
## 324 0.28392 0.0 7.38 0 0.4930 5.708 74.3 4.7211 5 287 19.6 391.13
## 325 0.34109 0.0 7.38 0 0.4930 6.415 40.1 4.7211 5 287 19.6 396.90
## 326 0.19186 0.0 7.38 0 0.4930 6.431 14.7 5.4159 5 287 19.6 393.68
## 327 0.30347 0.0 7.38 0 0.4930 6.312 28.9 5.4159 5 287 19.6 396.90
## 328 0.24103 0.0 7.38 0 0.4930 6.083 43.7 5.4159 5 287 19.6 396.90
## 329 0.06617 0.0 3.24 0 0.4600 5.868 25.8 5.2146 4 430 16.9 382.44
## 330 0.06724 0.0 3.24 0 0.4600 6.333 17.2 5.2146 4 430 16.9 375.21
## 331 0.04544 0.0 3.24 0 0.4600 6.144 32.2 5.8736 4 430 16.9 368.57
## 332 0.05023 35.0 6.06 0 0.4379 5.706 28.4 6.6407 1 304 16.9 394.02
## 333 0.03466 35.0 6.06 0 0.4379 6.031 23.3 6.6407 1 304 16.9 362.25
## 334 0.05083 0.0 5.19 0 0.5150 6.316 38.1 6.4584 5 224 20.2 389.71
## 335 0.03738 0.0 5.19 0 0.5150 6.310 38.5 6.4584 5 224 20.2 389.40
## 336 0.03961 0.0 5.19 0 0.5150 6.037 34.5 5.9853 5 224 20.2 396.90
## 337 0.03427 0.0 5.19 0 0.5150 5.869 46.3 5.2311 5 224 20.2 396.90
## 338 0.03041 0.0 5.19 0 0.5150 5.895 59.6 5.6150 5 224 20.2 394.81
## 339 0.03306 0.0 5.19 0 0.5150 6.059 37.3 4.8122 5 224 20.2 396.14
## 340 0.05497 0.0 5.19 0 0.5150 5.985 45.4 4.8122 5 224 20.2 396.90
## 341 0.06151 0.0 5.19 0 0.5150 5.968 58.5 4.8122 5 224 20.2 396.90
## 342 0.01301 35.0 1.52 0 0.4420 7.241 49.3 7.0379 1 284 15.5 394.74
## 343 0.02498 0.0 1.89 0 0.5180 6.540 59.7 6.2669 1 422 15.9 389.96
## 344 0.02543 55.0 3.78 0 0.4840 6.696 56.4 5.7321 5 370 17.6 396.90
## 345 0.03049 55.0 3.78 0 0.4840 6.874 28.1 6.4654 5 370 17.6 387.97
## 346 0.03113 0.0 4.39 0 0.4420 6.014 48.5 8.0136 3 352 18.8 385.64
## 347 0.06162 0.0 4.39 0 0.4420 5.898 52.3 8.0136 3 352 18.8 364.61
## 348 0.01870 85.0 4.15 0 0.4290 6.516 27.7 8.5353 4 351 17.9 392.43
## 349 0.01501 80.0 2.01 0 0.4350 6.635 29.7 8.3440 4 280 17.0 390.94
## 350 0.02899 40.0 1.25 0 0.4290 6.939 34.5 8.7921 1 335 19.7 389.85
## 351 0.06211 40.0 1.25 0 0.4290 6.490 44.4 8.7921 1 335 19.7 396.90
## 352 0.07950 60.0 1.69 0 0.4110 6.579 35.9 10.7103 4 411 18.3 370.78
## 353 0.07244 60.0 1.69 0 0.4110 5.884 18.5 10.7103 4 411 18.3 392.33
## 354 0.01709 90.0 2.02 0 0.4100 6.728 36.1 12.1265 5 187 17.0 384.46
## 355 0.04301 80.0 1.91 0 0.4130 5.663 21.9 10.5857 4 334 22.0 382.80
## 356 0.10659 80.0 1.91 0 0.4130 5.936 19.5 10.5857 4 334 22.0 376.04
## 357 8.98296 0.0 18.10 1 0.7700 6.212 97.4 2.1222 24 666 20.2 377.73
## 358 3.84970 0.0 18.10 1 0.7700 6.395 91.0 2.5052 24 666 20.2 391.34
## 359 5.20177 0.0 18.10 1 0.7700 6.127 83.4 2.7227 24 666 20.2 395.43
## 360 4.26131 0.0 18.10 0 0.7700 6.112 81.3 2.5091 24 666 20.2 390.74
## 361 4.54192 0.0 18.10 0 0.7700 6.398 88.0 2.5182 24 666 20.2 374.56
## 362 3.83684 0.0 18.10 0 0.7700 6.251 91.1 2.2955 24 666 20.2 350.65
## 363 3.67822 0.0 18.10 0 0.7700 5.362 96.2 2.1036 24 666 20.2 380.79
## 364 4.22239 0.0 18.10 1 0.7700 5.803 89.0 1.9047 24 666 20.2 353.04
## 365 3.47428 0.0 18.10 1 0.7180 8.780 82.9 1.9047 24 666 20.2 354.55
## 366 4.55587 0.0 18.10 0 0.7180 3.561 87.9 1.6132 24 666 20.2 354.70
## 367 3.69695 0.0 18.10 0 0.7180 4.963 91.4 1.7523 24 666 20.2 316.03
## 368 13.52220 0.0 18.10 0 0.6310 3.863 100.0 1.5106 24 666 20.2 131.42
## 369 4.89822 0.0 18.10 0 0.6310 4.970 100.0 1.3325 24 666 20.2 375.52
## 370 5.66998 0.0 18.10 1 0.6310 6.683 96.8 1.3567 24 666 20.2 375.33
## 371 6.53876 0.0 18.10 1 0.6310 7.016 97.5 1.2024 24 666 20.2 392.05
## 372 9.23230 0.0 18.10 0 0.6310 6.216 100.0 1.1691 24 666 20.2 366.15
## 373 8.26725 0.0 18.10 1 0.6680 5.875 89.6 1.1296 24 666 20.2 347.88
## 374 11.10810 0.0 18.10 0 0.6680 4.906 100.0 1.1742 24 666 20.2 396.90
## 375 18.49820 0.0 18.10 0 0.6680 4.138 100.0 1.1370 24 666 20.2 396.90
## 376 19.60910 0.0 18.10 0 0.6710 7.313 97.9 1.3163 24 666 20.2 396.90
## 377 15.28800 0.0 18.10 0 0.6710 6.649 93.3 1.3449 24 666 20.2 363.02
## 378 9.82349 0.0 18.10 0 0.6710 6.794 98.8 1.3580 24 666 20.2 396.90
## 379 23.64820 0.0 18.10 0 0.6710 6.380 96.2 1.3861 24 666 20.2 396.90
## 380 17.86670 0.0 18.10 0 0.6710 6.223 100.0 1.3861 24 666 20.2 393.74
## 381 88.97620 0.0 18.10 0 0.6710 6.968 91.9 1.4165 24 666 20.2 396.90
## 382 15.87440 0.0 18.10 0 0.6710 6.545 99.1 1.5192 24 666 20.2 396.90
## 383 9.18702 0.0 18.10 0 0.7000 5.536 100.0 1.5804 24 666 20.2 396.90
## 384 7.99248 0.0 18.10 0 0.7000 5.520 100.0 1.5331 24 666 20.2 396.90
## 385 20.08490 0.0 18.10 0 0.7000 4.368 91.2 1.4395 24 666 20.2 285.83
## 386 16.81180 0.0 18.10 0 0.7000 5.277 98.1 1.4261 24 666 20.2 396.90
## 387 24.39380 0.0 18.10 0 0.7000 4.652 100.0 1.4672 24 666 20.2 396.90
## 388 22.59710 0.0 18.10 0 0.7000 5.000 89.5 1.5184 24 666 20.2 396.90
## 389 14.33370 0.0 18.10 0 0.7000 4.880 100.0 1.5895 24 666 20.2 372.92
## 390 8.15174 0.0 18.10 0 0.7000 5.390 98.9 1.7281 24 666 20.2 396.90
## 391 6.96215 0.0 18.10 0 0.7000 5.713 97.0 1.9265 24 666 20.2 394.43
## 392 5.29305 0.0 18.10 0 0.7000 6.051 82.5 2.1678 24 666 20.2 378.38
## 393 11.57790 0.0 18.10 0 0.7000 5.036 97.0 1.7700 24 666 20.2 396.90
## 394 8.64476 0.0 18.10 0 0.6930 6.193 92.6 1.7912 24 666 20.2 396.90
## 395 13.35980 0.0 18.10 0 0.6930 5.887 94.7 1.7821 24 666 20.2 396.90
## 396 8.71675 0.0 18.10 0 0.6930 6.471 98.8 1.7257 24 666 20.2 391.98
## 397 5.87205 0.0 18.10 0 0.6930 6.405 96.0 1.6768 24 666 20.2 396.90
## 398 7.67202 0.0 18.10 0 0.6930 5.747 98.9 1.6334 24 666 20.2 393.10
## 399 38.35180 0.0 18.10 0 0.6930 5.453 100.0 1.4896 24 666 20.2 396.90
## 400 9.91655 0.0 18.10 0 0.6930 5.852 77.8 1.5004 24 666 20.2 338.16
## 401 25.04610 0.0 18.10 0 0.6930 5.987 100.0 1.5888 24 666 20.2 396.90
## 402 14.23620 0.0 18.10 0 0.6930 6.343 100.0 1.5741 24 666 20.2 396.90
## 403 9.59571 0.0 18.10 0 0.6930 6.404 100.0 1.6390 24 666 20.2 376.11
## 404 24.80170 0.0 18.10 0 0.6930 5.349 96.0 1.7028 24 666 20.2 396.90
## 405 41.52920 0.0 18.10 0 0.6930 5.531 85.4 1.6074 24 666 20.2 329.46
## 406 67.92080 0.0 18.10 0 0.6930 5.683 100.0 1.4254 24 666 20.2 384.97
## 407 20.71620 0.0 18.10 0 0.6590 4.138 100.0 1.1781 24 666 20.2 370.22
## 408 11.95110 0.0 18.10 0 0.6590 5.608 100.0 1.2852 24 666 20.2 332.09
## 409 7.40389 0.0 18.10 0 0.5970 5.617 97.9 1.4547 24 666 20.2 314.64
## 410 14.43830 0.0 18.10 0 0.5970 6.852 100.0 1.4655 24 666 20.2 179.36
## 411 51.13580 0.0 18.10 0 0.5970 5.757 100.0 1.4130 24 666 20.2 2.60
## 412 14.05070 0.0 18.10 0 0.5970 6.657 100.0 1.5275 24 666 20.2 35.05
## 413 18.81100 0.0 18.10 0 0.5970 4.628 100.0 1.5539 24 666 20.2 28.79
## 414 28.65580 0.0 18.10 0 0.5970 5.155 100.0 1.5894 24 666 20.2 210.97
## 415 45.74610 0.0 18.10 0 0.6930 4.519 100.0 1.6582 24 666 20.2 88.27
## 416 18.08460 0.0 18.10 0 0.6790 6.434 100.0 1.8347 24 666 20.2 27.25
## 417 10.83420 0.0 18.10 0 0.6790 6.782 90.8 1.8195 24 666 20.2 21.57
## 418 25.94060 0.0 18.10 0 0.6790 5.304 89.1 1.6475 24 666 20.2 127.36
## 419 73.53410 0.0 18.10 0 0.6790 5.957 100.0 1.8026 24 666 20.2 16.45
## 420 11.81230 0.0 18.10 0 0.7180 6.824 76.5 1.7940 24 666 20.2 48.45
## 421 11.08740 0.0 18.10 0 0.7180 6.411 100.0 1.8589 24 666 20.2 318.75
## 422 7.02259 0.0 18.10 0 0.7180 6.006 95.3 1.8746 24 666 20.2 319.98
## 423 12.04820 0.0 18.10 0 0.6140 5.648 87.6 1.9512 24 666 20.2 291.55
## 424 7.05042 0.0 18.10 0 0.6140 6.103 85.1 2.0218 24 666 20.2 2.52
## 425 8.79212 0.0 18.10 0 0.5840 5.565 70.6 2.0635 24 666 20.2 3.65
## 426 15.86030 0.0 18.10 0 0.6790 5.896 95.4 1.9096 24 666 20.2 7.68
## 427 12.24720 0.0 18.10 0 0.5840 5.837 59.7 1.9976 24 666 20.2 24.65
## 428 37.66190 0.0 18.10 0 0.6790 6.202 78.7 1.8629 24 666 20.2 18.82
## 429 7.36711 0.0 18.10 0 0.6790 6.193 78.1 1.9356 24 666 20.2 96.73
## 430 9.33889 0.0 18.10 0 0.6790 6.380 95.6 1.9682 24 666 20.2 60.72
## 431 8.49213 0.0 18.10 0 0.5840 6.348 86.1 2.0527 24 666 20.2 83.45
## 432 10.06230 0.0 18.10 0 0.5840 6.833 94.3 2.0882 24 666 20.2 81.33
## 433 6.44405 0.0 18.10 0 0.5840 6.425 74.8 2.2004 24 666 20.2 97.95
## 434 5.58107 0.0 18.10 0 0.7130 6.436 87.9 2.3158 24 666 20.2 100.19
## 435 13.91340 0.0 18.10 0 0.7130 6.208 95.0 2.2222 24 666 20.2 100.63
## 436 11.16040 0.0 18.10 0 0.7400 6.629 94.6 2.1247 24 666 20.2 109.85
## 437 14.42080 0.0 18.10 0 0.7400 6.461 93.3 2.0026 24 666 20.2 27.49
## 438 15.17720 0.0 18.10 0 0.7400 6.152 100.0 1.9142 24 666 20.2 9.32
## 439 13.67810 0.0 18.10 0 0.7400 5.935 87.9 1.8206 24 666 20.2 68.95
## 440 9.39063 0.0 18.10 0 0.7400 5.627 93.9 1.8172 24 666 20.2 396.90
## 441 22.05110 0.0 18.10 0 0.7400 5.818 92.4 1.8662 24 666 20.2 391.45
## 442 9.72418 0.0 18.10 0 0.7400 6.406 97.2 2.0651 24 666 20.2 385.96
## 443 5.66637 0.0 18.10 0 0.7400 6.219 100.0 2.0048 24 666 20.2 395.69
## 444 9.96654 0.0 18.10 0 0.7400 6.485 100.0 1.9784 24 666 20.2 386.73
## 445 12.80230 0.0 18.10 0 0.7400 5.854 96.6 1.8956 24 666 20.2 240.52
## 446 10.67180 0.0 18.10 0 0.7400 6.459 94.8 1.9879 24 666 20.2 43.06
## 447 6.28807 0.0 18.10 0 0.7400 6.341 96.4 2.0720 24 666 20.2 318.01
## 448 9.92485 0.0 18.10 0 0.7400 6.251 96.6 2.1980 24 666 20.2 388.52
## 449 9.32909 0.0 18.10 0 0.7130 6.185 98.7 2.2616 24 666 20.2 396.90
## 450 7.52601 0.0 18.10 0 0.7130 6.417 98.3 2.1850 24 666 20.2 304.21
## 451 6.71772 0.0 18.10 0 0.7130 6.749 92.6 2.3236 24 666 20.2 0.32
## 452 5.44114 0.0 18.10 0 0.7130 6.655 98.2 2.3552 24 666 20.2 355.29
## 453 5.09017 0.0 18.10 0 0.7130 6.297 91.8 2.3682 24 666 20.2 385.09
## 454 8.24809 0.0 18.10 0 0.7130 7.393 99.3 2.4527 24 666 20.2 375.87
## 455 9.51363 0.0 18.10 0 0.7130 6.728 94.1 2.4961 24 666 20.2 6.68
## 456 4.75237 0.0 18.10 0 0.7130 6.525 86.5 2.4358 24 666 20.2 50.92
## 457 4.66883 0.0 18.10 0 0.7130 5.976 87.9 2.5806 24 666 20.2 10.48
## 458 8.20058 0.0 18.10 0 0.7130 5.936 80.3 2.7792 24 666 20.2 3.50
## 459 7.75223 0.0 18.10 0 0.7130 6.301 83.7 2.7831 24 666 20.2 272.21
## 460 6.80117 0.0 18.10 0 0.7130 6.081 84.4 2.7175 24 666 20.2 396.90
## 461 4.81213 0.0 18.10 0 0.7130 6.701 90.0 2.5975 24 666 20.2 255.23
## 462 3.69311 0.0 18.10 0 0.7130 6.376 88.4 2.5671 24 666 20.2 391.43
## 463 6.65492 0.0 18.10 0 0.7130 6.317 83.0 2.7344 24 666 20.2 396.90
## 464 5.82115 0.0 18.10 0 0.7130 6.513 89.9 2.8016 24 666 20.2 393.82
## 465 7.83932 0.0 18.10 0 0.6550 6.209 65.4 2.9634 24 666 20.2 396.90
## 466 3.16360 0.0 18.10 0 0.6550 5.759 48.2 3.0665 24 666 20.2 334.40
## 467 3.77498 0.0 18.10 0 0.6550 5.952 84.7 2.8715 24 666 20.2 22.01
## 468 4.42228 0.0 18.10 0 0.5840 6.003 94.5 2.5403 24 666 20.2 331.29
## 469 15.57570 0.0 18.10 0 0.5800 5.926 71.0 2.9084 24 666 20.2 368.74
## 470 13.07510 0.0 18.10 0 0.5800 5.713 56.7 2.8237 24 666 20.2 396.90
## 471 4.34879 0.0 18.10 0 0.5800 6.167 84.0 3.0334 24 666 20.2 396.90
## 472 4.03841 0.0 18.10 0 0.5320 6.229 90.7 3.0993 24 666 20.2 395.33
## 473 3.56868 0.0 18.10 0 0.5800 6.437 75.0 2.8965 24 666 20.2 393.37
## 474 4.64689 0.0 18.10 0 0.6140 6.980 67.6 2.5329 24 666 20.2 374.68
## 475 8.05579 0.0 18.10 0 0.5840 5.427 95.4 2.4298 24 666 20.2 352.58
## 476 6.39312 0.0 18.10 0 0.5840 6.162 97.4 2.2060 24 666 20.2 302.76
## 477 4.87141 0.0 18.10 0 0.6140 6.484 93.6 2.3053 24 666 20.2 396.21
## 478 15.02340 0.0 18.10 0 0.6140 5.304 97.3 2.1007 24 666 20.2 349.48
## 479 10.23300 0.0 18.10 0 0.6140 6.185 96.7 2.1705 24 666 20.2 379.70
## 480 14.33370 0.0 18.10 0 0.6140 6.229 88.0 1.9512 24 666 20.2 383.32
## 481 5.82401 0.0 18.10 0 0.5320 6.242 64.7 3.4242 24 666 20.2 396.90
## 482 5.70818 0.0 18.10 0 0.5320 6.750 74.9 3.3317 24 666 20.2 393.07
## 483 5.73116 0.0 18.10 0 0.5320 7.061 77.0 3.4106 24 666 20.2 395.28
## 484 2.81838 0.0 18.10 0 0.5320 5.762 40.3 4.0983 24 666 20.2 392.92
## 485 2.37857 0.0 18.10 0 0.5830 5.871 41.9 3.7240 24 666 20.2 370.73
## 486 3.67367 0.0 18.10 0 0.5830 6.312 51.9 3.9917 24 666 20.2 388.62
## 487 5.69175 0.0 18.10 0 0.5830 6.114 79.8 3.5459 24 666 20.2 392.68
## 488 4.83567 0.0 18.10 0 0.5830 5.905 53.2 3.1523 24 666 20.2 388.22
## 489 0.15086 0.0 27.74 0 0.6090 5.454 92.7 1.8209 4 711 20.1 395.09
## 490 0.18337 0.0 27.74 0 0.6090 5.414 98.3 1.7554 4 711 20.1 344.05
## 491 0.20746 0.0 27.74 0 0.6090 5.093 98.0 1.8226 4 711 20.1 318.43
## 492 0.10574 0.0 27.74 0 0.6090 5.983 98.8 1.8681 4 711 20.1 390.11
## 493 0.11132 0.0 27.74 0 0.6090 5.983 83.5 2.1099 4 711 20.1 396.90
## 494 0.17331 0.0 9.69 0 0.5850 5.707 54.0 2.3817 6 391 19.2 396.90
## 495 0.27957 0.0 9.69 0 0.5850 5.926 42.6 2.3817 6 391 19.2 396.90
## 496 0.17899 0.0 9.69 0 0.5850 5.670 28.8 2.7986 6 391 19.2 393.29
## 497 0.28960 0.0 9.69 0 0.5850 5.390 72.9 2.7986 6 391 19.2 396.90
## 498 0.26838 0.0 9.69 0 0.5850 5.794 70.6 2.8927 6 391 19.2 396.90
## 499 0.23912 0.0 9.69 0 0.5850 6.019 65.3 2.4091 6 391 19.2 396.90
## 500 0.17783 0.0 9.69 0 0.5850 5.569 73.5 2.3999 6 391 19.2 395.77
## 501 0.22438 0.0 9.69 0 0.5850 6.027 79.7 2.4982 6 391 19.2 396.90
## 502 0.06263 0.0 11.93 0 0.5730 6.593 69.1 2.4786 1 273 21.0 391.99
## 503 0.04527 0.0 11.93 0 0.5730 6.120 76.7 2.2875 1 273 21.0 396.90
## 504 0.06076 0.0 11.93 0 0.5730 6.976 91.0 2.1675 1 273 21.0 396.90
## 505 0.10959 0.0 11.93 0 0.5730 6.794 89.3 2.3889 1 273 21.0 393.45
## 506 0.04741 0.0 11.93 0 0.5730 6.030 80.8 2.5050 1 273 21.0 396.90
## lstat medv
## 1 4.98 24.0
## 2 9.14 21.6
## 3 4.03 34.7
## 4 2.94 33.4
## 5 5.33 36.2
## 6 5.21 28.7
## 7 12.43 22.9
## 8 19.15 27.1
## 9 29.93 16.5
## 10 17.10 18.9
## 11 20.45 15.0
## 12 13.27 18.9
## 13 15.71 21.7
## 14 8.26 20.4
## 15 10.26 18.2
## 16 8.47 19.9
## 17 6.58 23.1
## 18 14.67 17.5
## 19 11.69 20.2
## 20 11.28 18.2
## 21 21.02 13.6
## 22 13.83 19.6
## 23 18.72 15.2
## 24 19.88 14.5
## 25 16.30 15.6
## 26 16.51 13.9
## 27 14.81 16.6
## 28 17.28 14.8
## 29 12.80 18.4
## 30 11.98 21.0
## 31 22.60 12.7
## 32 13.04 14.5
## 33 27.71 13.2
## 34 18.35 13.1
## 35 20.34 13.5
## 36 9.68 18.9
## 37 11.41 20.0
## 38 8.77 21.0
## 39 10.13 24.7
## 40 4.32 30.8
## 41 1.98 34.9
## 42 4.84 26.6
## 43 5.81 25.3
## 44 7.44 24.7
## 45 9.55 21.2
## 46 10.21 19.3
## 47 14.15 20.0
## 48 18.80 16.6
## 49 30.81 14.4
## 50 16.20 19.4
## 51 13.45 19.7
## 52 9.43 20.5
## 53 5.28 25.0
## 54 8.43 23.4
## 55 14.80 18.9
## 56 4.81 35.4
## 57 5.77 24.7
## 58 3.95 31.6
## 59 6.86 23.3
## 60 9.22 19.6
## 61 13.15 18.7
## 62 14.44 16.0
## 63 6.73 22.2
## 64 9.50 25.0
## 65 8.05 33.0
## 66 4.67 23.5
## 67 10.24 19.4
## 68 8.10 22.0
## 69 13.09 17.4
## 70 8.79 20.9
## 71 6.72 24.2
## 72 9.88 21.7
## 73 5.52 22.8
## 74 7.54 23.4
## 75 6.78 24.1
## 76 8.94 21.4
## 77 11.97 20.0
## 78 10.27 20.8
## 79 12.34 21.2
## 80 9.10 20.3
## 81 5.29 28.0
## 82 7.22 23.9
## 83 6.72 24.8
## 84 7.51 22.9
## 85 9.62 23.9
## 86 6.53 26.6
## 87 12.86 22.5
## 88 8.44 22.2
## 89 5.50 23.6
## 90 5.70 28.7
## 91 8.81 22.6
## 92 8.20 22.0
## 93 8.16 22.9
## 94 6.21 25.0
## 95 10.59 20.6
## 96 6.65 28.4
## 97 11.34 21.4
## 98 4.21 38.7
## 99 3.57 43.8
## 100 6.19 33.2
## 101 9.42 27.5
## 102 7.67 26.5
## 103 10.63 18.6
## 104 13.44 19.3
## 105 12.33 20.1
## 106 16.47 19.5
## 107 18.66 19.5
## 108 14.09 20.4
## 109 12.27 19.8
## 110 15.55 19.4
## 111 13.00 21.7
## 112 10.16 22.8
## 113 16.21 18.8
## 114 17.09 18.7
## 115 10.45 18.5
## 116 15.76 18.3
## 117 12.04 21.2
## 118 10.30 19.2
## 119 15.37 20.4
## 120 13.61 19.3
## 121 14.37 22.0
## 122 14.27 20.3
## 123 17.93 20.5
## 124 25.41 17.3
## 125 17.58 18.8
## 126 14.81 21.4
## 127 27.26 15.7
## 128 17.19 16.2
## 129 15.39 18.0
## 130 18.34 14.3
## 131 12.60 19.2
## 132 12.26 19.6
## 133 11.12 23.0
## 134 15.03 18.4
## 135 17.31 15.6
## 136 16.96 18.1
## 137 16.90 17.4
## 138 14.59 17.1
## 139 21.32 13.3
## 140 18.46 17.8
## 141 24.16 14.0
## 142 34.41 14.4
## 143 26.82 13.4
## 144 26.42 15.6
## 145 29.29 11.8
## 146 27.80 13.8
## 147 16.65 15.6
## 148 29.53 14.6
## 149 28.32 17.8
## 150 21.45 15.4
## 151 14.10 21.5
## 152 13.28 19.6
## 153 12.12 15.3
## 154 15.79 19.4
## 155 15.12 17.0
## 156 15.02 15.6
## 157 16.14 13.1
## 158 4.59 41.3
## 159 6.43 24.3
## 160 7.39 23.3
## 161 5.50 27.0
## 162 1.73 50.0
## 163 1.92 50.0
## 164 3.32 50.0
## 165 11.64 22.7
## 166 9.81 25.0
## 167 3.70 50.0
## 168 12.14 23.8
## 169 11.10 23.8
## 170 11.32 22.3
## 171 14.43 17.4
## 172 12.03 19.1
## 173 14.69 23.1
## 174 9.04 23.6
## 175 9.64 22.6
## 176 5.33 29.4
## 177 10.11 23.2
## 178 6.29 24.6
## 179 6.92 29.9
## 180 5.04 37.2
## 181 7.56 39.8
## 182 9.45 36.2
## 183 4.82 37.9
## 184 5.68 32.5
## 185 13.98 26.4
## 186 13.15 29.6
## 187 4.45 50.0
## 188 6.68 32.0
## 189 4.56 29.8
## (printed output truncated: rows 190-506 of the Boston data, columns lstat and medv, omitted for brevity)
set.seed(1)
train <- sample(1:nrow(Boston), nrow(Boston) / 2)
boston.test <- Boston[-train, "medv"]

# Grid of mtry and ntree values. Boston has 13 predictors, so the mtry values
# above 13 are clamped by randomForest, producing the "invalid mtry" warning.
mtry_ <- 6:15
ntree_ <- c(25, 50, 100, 150, 500)
vect_error <- vector()
vect_cate <- vector()
k <- 1
for (i in mtry_) {
  for (j in ntree_) {
    rf.boston <- randomForest(medv ~ ., data = Boston, subset = train,
                              mtry = i, importance = TRUE, ntree = j)
    yhat.rf <- predict(rf.boston, newdata = Boston[-train, ])
    vect_error[k] <- mean((yhat.rf - boston.test)^2)  # test MSE
    vect_cate[k] <- str_c(i, j, sep = "-")            # label "mtry-ntree" (stringr)
    k <- k + 1
  }
}
## Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
## range
grup <- rep(paste0("g", 1:10), each = 5)
df_error <- data.frame(vect_cate, vect_error, rep(1:5, 10), grup)
names(df_error) <- c("tree_mtry", "y", "eje_x", "grup")
ggplot(df_error, aes(x = eje_x, y = y, group = grup, colour = grup)) +
  geom_line() +
  geom_point(size = 2, shape = 21, fill = "white") +
  theme_minimal()
Split the data set into a training set and a test set.
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
Fit a regression tree to the training set. Plot the tree, and interpret the results. What test MSE do you obtain?
## Registered S3 method overwritten by 'tree':
## method from
## print.tree cli
##
## Regression tree:
## tree(formula = Sales ~ ., data = dt_train)
## Variables actually used in tree construction:
## [1] "ShelveLoc" "Price" "Income" "CompPrice" "Advertising"
## [6] "Age"
## Number of terminal nodes: 18
## Residual mean deviance: 2.523 = 764.4 / 303
## Distribution of residuals:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -3.90000 -1.06200 -0.01417 0.00000 1.04800 4.22600
### C
Use cross-validation in order to determine the optimal level of tree complexity. Does pruning the tree improve the test MSE?
### D
Use the bagging approach in order to analyze this data. What test MSE do you obtain? Use the importance() function to determine which variables are most important.
## Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
## range
##
## Call:
## randomForest(formula = Sales ~ ., data = dt_train, mtry = 13, importance = TRUE)
## Type of random forest: regression
## Number of trees: 500
## No. of variables tried at each split: 10
##
## Mean of squared residuals: 2.45914
## % Var explained: 68.9
## MSE TEST
## [1] 2.246559
## %IncMSE IncNodePurity
## CompPrice 33.522788 256.384174
## Income 7.762270 122.801187
## Advertising 21.076091 164.771460
## Population -1.409105 82.251739
## Price 69.404219 732.183579
## ShelveLoc 81.003894 793.424922
## Age 25.304258 242.349739
## Education 3.707298 73.960587
## Urban 1.087029 10.579006
## US 2.860203 9.007516
### E
Use random forests to analyze this data. What test MSE do you obtain? Use the importance() function to determine which variables are most important. Describe the effect of m, the number of variables considered at each split, on the error rate obtained.
rf.sale <- randomForest(Sales ~ ., data = dt_train, mtry = 6, importance = TRUE)
yhat.rf <- predict(rf.sale, newdata = df_test[, -1])
cat('MSE TEST \n')
## MSE TEST
## [1] 2.293773
## %IncMSE IncNodePurity
## CompPrice 24.904328 241.40874
## Income 7.464867 147.36056
## Advertising 18.623228 189.05533
## Population -1.691185 98.50981
## Price 54.701286 680.50888
## ShelveLoc 66.659911 736.87907
## Age 23.716570 259.61046
## Education 3.398162 82.70636
## Urban -2.500238 13.53313
## US 4.075596 19.04079
This problem involves the OJ data set which is part of the ISLR package.
Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.
Fit a tree to the training data, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics about the tree, and describe the results obtained. What is the training error rate? How many terminal nodes does the tree have?
##
## Classification tree:
## tree(formula = Purchase ~ ., data = train_OJ)
## Variables actually used in tree construction:
## [1] "LoyalCH" "PriceDiff"
## Number of terminal nodes: 9
## Residual mean deviance: 0.7273 = 575.3 / 791
## Misclassification error rate: 0.1575 = 126 / 800
Type in the name of the tree object in order to get a detailed text output. Pick one of the terminal nodes, and interpret the information displayed.
## node), split, n, deviance, yval, (yprob)
## * denotes terminal node
##
## 1) root 800 1076.000 CH ( 0.60125 0.39875 )
## 2) LoyalCH < 0.5036 355 414.400 MM ( 0.27042 0.72958 )
## 4) LoyalCH < 0.279374 174 124.100 MM ( 0.11494 0.88506 )
## 8) LoyalCH < 0.0356415 62 10.240 MM ( 0.01613 0.98387 ) *
## 9) LoyalCH > 0.0356415 112 102.000 MM ( 0.16964 0.83036 ) *
## 5) LoyalCH > 0.279374 181 246.300 MM ( 0.41989 0.58011 )
## 10) PriceDiff < 0.05 72 70.930 MM ( 0.19444 0.80556 ) *
## 11) PriceDiff > 0.05 109 149.000 CH ( 0.56881 0.43119 ) *
## 3) LoyalCH > 0.5036 445 352.000 CH ( 0.86517 0.13483 )
## 6) PriceDiff < -0.39 28 35.160 MM ( 0.32143 0.67857 )
## 12) LoyalCH < 0.974803 21 17.220 MM ( 0.14286 0.85714 ) *
## 13) LoyalCH > 0.974803 7 5.742 CH ( 0.85714 0.14286 ) *
## 7) PriceDiff > -0.39 417 268.000 CH ( 0.90168 0.09832 )
## 14) LoyalCH < 0.705326 128 134.500 CH ( 0.78125 0.21875 )
## 28) PriceDiff < 0.265 69 90.350 CH ( 0.63768 0.36232 ) *
## 29) PriceDiff > 0.265 59 23.720 CH ( 0.94915 0.05085 ) *
## 15) LoyalCH > 0.705326 289 106.000 CH ( 0.95502 0.04498 ) *
The first split is on LoyalCH, which measures the customer's brand loyalty. Since it takes values between 0 and 1, observations are routed toward the MM or the CH side of the tree depending on its value.
Predict the response on the test data, and produce a confusion matrix comparing the test labels to the predicted test labels. What is the test error rate?
##
## tree.pred CH MM
## CH 154 33
## MM 18 65
## Correct classification rate:
## [1] 0.8111111
Apply the cv.tree() function to the training set in order to determine the optimal tree size.
## [1] "size" "dev" "k" "method"
## $size
## [1] 9 6 5 3 2 1
##
## $dev
## [1] 139 139 146 156 157 319
##
## $k
## [1] -Inf 0.0 5.0 7.5 10.0 163.0
##
## $method
## [1] "misclass"
##
## attr(,"class")
## [1] "prune" "tree.sequence"
Produce a plot with tree size on the x-axis and cross-validated classification error rate on the y-axis.
Which tree size corresponds to the lowest cross-validated classification error rate?
The tree size with the lowest cross-validated classification error rate is six terminal nodes.
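Selecting the optimal size from the cv.tree() output above amounts to taking the size with the smallest deviance, breaking ties toward the smaller tree. A small Python sketch with the $size and $dev vectors copied from that output:

```python
# $size and $dev copied from the cv.tree() output above
sizes = [9, 6, 5, 3, 2, 1]
dev = [139, 139, 146, 156, 157, 319]

best_dev = min(dev)
# among sizes tied at the minimal CV error, prefer the smallest tree
best_size = min(s for s, d in zip(sizes, dev) if d == best_dev)
print(best_size)  # → 6
```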
Produce a pruned tree corresponding to the optimal tree size obtained using cross-validation. If cross-validation does not lead to selection of a pruned tree, then create a pruned tree with five terminal nodes.
Compare the training error rates between the pruned and unpruned trees. Which is higher?
##
## tree.pred CH MM
## CH 154 33
## MM 18 65
## Correct classification rate:
## [1] 0.8111111
The classification rate did not change: six terminal nodes already achieve the maximum classification rate, so there is no difference between using nine nodes and using six.
We now use boosting to predict Salary in the Hitters data set.
Remove the observations for whom the salary information is unknown, and then log-transform the salaries.
Create a training set consisting of the first 200 observations, and a test set consisting of the remaining observations.
Perform boosting on the training set with 1,000 trees for a range of values of the shrinkage parameter λ. Produce a plot with different shrinkage values on the x-axis and the corresponding training set MSE on the y-axis.
## Loaded gbm 2.1.8
lmda <- c(0.1, 0.2, 0.3, 0.4, 0.5)
vect_error <- vector()
k <- 1
for (i in lmda) {
  boost.sale <- gbm(log_sales ~ ., data = train, distribution = "gaussian",
                    n.trees = 1000, interaction.depth = 4, shrinkage = i, verbose = F)
  # train.error holds the training MSE after each tree; keep the final value
  # (the original sum(train.error)/length(train) mixed up iterations and columns)
  vect_error[k] <- boost.sale$train.error[1000]
  k <- k + 1
}
plot(lmda, vect_error, type = "b")
Produce a plot with different shrinkage values on the x-axis and the corresponding test set MSE on the y-axis.
lmda <- c(0.1, 0.2, 0.3, 0.4, 0.5)
vect_error <- vector()
k <- 1
for (i in lmda) {
  boost.sales <- gbm(log_sales ~ ., data = train, distribution = "gaussian",
                     n.trees = 1000, interaction.depth = 4, shrinkage = i, verbose = F)
  yhat.boost <- predict(boost.sales, newdata = test[, -length(test)], n.trees = 1000)
  # test MSE: average squared error over the test observations
  # (the original divided by length(test), i.e. the number of columns)
  vect_error[k] <- mean((yhat.boost - test[, length(test)])^2)
  k <- k + 1
}
plot(lmda, vect_error, type = "b")
Which variables appear to be the most important predictors in the boosted model?
boost.sales <- gbm(log_sales ~ ., data = train, distribution = "gaussian",
                   n.trees = 1000, interaction.depth = 4, shrinkage = 0.2, verbose = F)
summary(boost.sales)
## var rel.inf
## Price Price 33.0269938
## CompPrice CompPrice 18.0842664
## ShelveLoc ShelveLoc 11.2072020
## Population Population 10.4450670
## Income Income 9.5264010
## Age Age 8.5999683
## Advertising Advertising 4.2177659
## Education Education 3.5275449
## US US 0.8530239
## Urban Urban 0.5117668
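The role of the shrinkage parameter in the loops above can be illustrated with a toy gradient-boosting sketch (hypothetical 1-D data and regression stumps as the base learner; this is not the gbm implementation): each iteration fits a stump to the current residuals and adds only a fraction λ of it, so a smaller λ needs more trees to reach the same training MSE.

```python
import math
import random

def fit_stump(x, r):
    """Best single-split regression stump (threshold, left mean, right mean) by SSE."""
    best_sse, best = float("inf"), None
    for s in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= s]
        right = [ri for xi, ri in zip(x, r) if xi > s]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if sse < best_sse:
            best_sse, best = sse, (s, lm, rm)
    return best

def boost(x, y, n_trees, lam):
    """Gradient boosting for squared error: repeatedly fit stumps to residuals."""
    pred = [0.0] * len(y)
    for _ in range(n_trees):
        resid = [yi - pi for yi, pi in zip(y, pred)]
        s, lm, rm = fit_stump(x, resid)
        pred = [pi + lam * (lm if xi <= s else rm) for pi, xi in zip(pred, x)]
    return pred

random.seed(1)
x = [random.uniform(-1, 1) for _ in range(100)]
y = [math.sin(3 * xi) + random.gauss(0, 0.1) for xi in x]
mse = lambda p: sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)

slow = boost(x, y, n_trees=100, lam=0.01)  # small shrinkage: still underfits at 100 trees
fast = boost(x, y, n_trees=100, lam=0.3)   # larger shrinkage: lower training MSE
```

With the same number of trees, the larger shrinkage value reaches a lower training MSE, which is exactly the pattern the training-MSE plot above explores.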
Now apply bagging to the training set. What is the test set MSE for this approach?
## Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
## range
##
## Call:
## randomForest(formula = log_sales ~ ., data = train, mtry = 13, importance = TRUE)
## Type of random forest: regression
## Number of trees: 500
## No. of variables tried at each split: 10
##
## Mean of squared residuals: 0.1777817
## % Var explained: 46.42
This question uses the Caravan data set.
Create a training set consisting of the first 1,000 observations, and a test set consisting of the remaining observations.
Fit a boosting model to the training set with Purchase as the response and the other variables as predictors. Use 1,000 trees, and a shrinkage value of 0.01. Which predictors appear to be the most important?
library(caret)
# Note: the exercise asks for shrinkage = 0.01; this fit uses 0.1.
mod_11 <- gbm(Purchase ~ ., data = cr_train, shrinkage = 0.1,
              distribution = "bernoulli", n.trees = 1000, verbose = F)
#best.iter = gbm.perf(mod_11, method="cv")
summary(mod_11)
## var rel.inf
## MINK3045 MINK3045 8.39479272
## MBERMIDD MBERMIDD 7.72016096
## PBYSTAND PBYSTAND 7.51821041
## MOSTYPE MOSTYPE 5.57126048
## MSKC MSKC 5.06909281
## MSKB1 MSKB1 5.00428283
## MGODGE MGODGE 4.35834765
## PPERSAUT PPERSAUT 3.91700847
## MKOOPKLA MKOOPKLA 3.43841388
## MOPLMIDD MOPLMIDD 3.38202295
## MOPLHOOG MOPLHOOG 3.35756831
## PBRAND PBRAND 3.15489915
## MFGEKIND MFGEKIND 2.57315068
## MBERARBG MBERARBG 2.16176441
## MBERHOOG MBERHOOG 2.15526570
## MINKGEM MINKGEM 1.77697778
## MBERZELF MBERZELF 1.60730204
## MINK7512 MINK7512 1.58602671
## APERSAUT APERSAUT 1.54079309
## MSKA MSKA 1.50888225
## PAANHANG PAANHANG 1.40864422
## MOPLLAAG MOPLLAAG 1.36741999
## PLEVEN PLEVEN 1.30270766
## MFWEKIND MFWEKIND 1.28464116
## MINKM30 MINKM30 1.22017731
## PFIETS PFIETS 1.20555874
## MINK4575 MINK4575 1.13879147
## MZPART MZPART 1.13828021
## MAUT1 MAUT1 1.10181701
## PWAPART PWAPART 1.01119861
## MGODRK MGODRK 0.96165858
## MSKB2 MSKB2 0.95031002
## MRELGE MRELGE 0.86058102
## MHHUUR MHHUUR 0.70049840
## MHKOOP MHKOOP 0.69119951
## MGODPR MGODPR 0.67839635
## MAUT0 MAUT0 0.67006776
## MBERARBO MBERARBO 0.60427847
## MGODOV MGODOV 0.53803527
## MSKD MSKD 0.51877707
## MAUT2 MAUT2 0.48268161
## ALEVEN ALEVEN 0.46570641
## MINK123M MINK123M 0.43136697
## MFALLEEN MFALLEEN 0.43131381
## AFIETS AFIETS 0.42202325
## MRELOV MRELOV 0.41310740
## MZFONDS MZFONDS 0.40806918
## MGEMOMV MGEMOMV 0.37311645
## MGEMLEEF MGEMLEEF 0.29853809
## MAANTHUI MAANTHUI 0.24660372
## MOSHOOFD MOSHOOFD 0.20658351
## PMOTSCO PMOTSCO 0.20393976
## PGEZONG PGEZONG 0.11370028
## MRELSA MRELSA 0.11320187
## ABRAND ABRAND 0.09334856
## MBERBOER MBERBOER 0.09141374
## PWABEDR PWABEDR 0.05602326
## PWALAND PWALAND 0.00000000
## PBESAUT PBESAUT 0.00000000
## PVRAAUT PVRAAUT 0.00000000
## PTRACTOR PTRACTOR 0.00000000
## PWERKT PWERKT 0.00000000
## PBROM PBROM 0.00000000
## PPERSONG PPERSONG 0.00000000
## PWAOREG PWAOREG 0.00000000
## PZEILPL PZEILPL 0.00000000
## PPLEZIER PPLEZIER 0.00000000
## PINBOED PINBOED 0.00000000
## AWAPART AWAPART 0.00000000
## AWABEDR AWABEDR 0.00000000
## AWALAND AWALAND 0.00000000
## ABESAUT ABESAUT 0.00000000
## AMOTSCO AMOTSCO 0.00000000
## AVRAAUT AVRAAUT 0.00000000
## AAANHANG AAANHANG 0.00000000
## ATRACTOR ATRACTOR 0.00000000
## AWERKT AWERKT 0.00000000
## ABROM ABROM 0.00000000
## APERSONG APERSONG 0.00000000
## AGEZONG AGEZONG 0.00000000
## AWAOREG AWAOREG 0.00000000
## AZEILPL AZEILPL 0.00000000
## APLEZIER APLEZIER 0.00000000
## AINBOED AINBOED 0.00000000
## ABYSTAND ABYSTAND 0.00000000
Apply boosting, bagging, and random forests to a data set of your choice. Be sure to fit the models on a training set and to evaluate their performance on a test set. How accurate are the results compared to simple methods like linear or logistic regression? Which of these approaches yields the best performance?
# mtcars data set
# Predicting miles per gallon
# bagging (mtcars has 10 predictors, so mtry = 13 is clamped with a warning)
mod_mpg <- randomForest(mpg ~ ., data = mtcars, mtry = 13, importance = TRUE)
## Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
## range
##
## Call:
## randomForest(formula = mpg ~ ., data = mtcars, mtry = 13, importance = TRUE)
## Type of random forest: regression
## Number of trees: 500
## No. of variables tried at each split: 10
##
## Mean of squared residuals: 5.628086
## % Var explained: 84.01
## Warning in randomForest.default(m, y, ...): invalid mtry: reset to within valid
## range
##
## Call:
## randomForest(formula = mpg ~ ., data = mtcars, mtry = 13, ntree = 100, importance = TRUE)
## Type of random forest: regression
## Number of trees: 100
## No. of variables tried at each split: 10
##
## Mean of squared residuals: 5.604802
## % Var explained: 84.07
##
## Call:
## lm(formula = mpg ~ ., data = mtcars)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.4506 -1.6044 -0.1196 1.2193 4.6271
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 12.30337 18.71788 0.657 0.5181
## cyl -0.11144 1.04502 -0.107 0.9161
## disp 0.01334 0.01786 0.747 0.4635
## hp -0.02148 0.02177 -0.987 0.3350
## drat 0.78711 1.63537 0.481 0.6353
## wt -3.71530 1.89441 -1.961 0.0633 .
## qsec 0.82104 0.73084 1.123 0.2739
## vs 0.31776 2.10451 0.151 0.8814
## am 2.52023 2.05665 1.225 0.2340
## gear 0.65541 1.49326 0.439 0.6652
## carb -0.19942 0.82875 -0.241 0.8122
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.65 on 21 degrees of freedom
## Multiple R-squared: 0.869, Adjusted R-squared: 0.8066
## F-statistic: 13.93 on 10 and 21 DF, p-value: 3.793e-07
The behaviour of the three models is very similar in terms of RMSE; they differ very little. The bagging and random forest models might improve depending on the tree pruning and the number of trees grown, i.e. by searching a grid for the optimal parameters.
Generate a simulated two-class data set with 100 observations and two features in which there is a visible but non-linear separation between the two classes. Show that in this setting, a support vector machine with a polynomial kernel (with degree greater than 1) or a radial kernel will outperform a support vector classifier on the training data. Which technique performs best on the test data? Make plots and report training and test error rates in order to back up your assertions.
set.seed(131019)
x <- matrix(rnorm(200 * 2), ncol = 2)
x[1:100, ] <- x[1:100, ] + 3
x[101:150, ] <- x[101:150, ] - 3
y <- c(rep(1, 150), rep(2, 50))
dat <- data.frame(x = x, y = as.factor(y))
train <- sample(200, 100)
svmfitPol <- svm(y ~ ., data = dat[train, ], kernel = "polynomial", cost = 1, degree = 2)
plot(svmfitPol, dat[train, ])
# test set and predictions were not echoed in the original; reconstructed here
test <- dat[-train, ]
predict_svm <- predict(svmfitPol, test)
mc_svm <- table(predict_svm, test$y)
mc_svm %>% kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| 1 | 2 |
|---|---|
| 67 | 2 |
| 13 | 18 |
cl_bsmv <- (mc_svm[1, 1] + mc_svm[2, 2]) / length(test$y)
cl_msvm <- (mc_svm[1, 2] + mc_svm[2, 1]) / length(test$y)
kable(cl_bsmv, col.names = "Correct classification")
| Correct classification |
|---|
| 0.85 |

| Misclassification |
|---|
| 0.15 |
train <- sample(200, 100)
svmfitRad <- svm(y ~ ., data = dat[train, ], kernel = "radial", cost = 1)
plot(svmfitRad, dat[train, ])
# test set and predictions were not echoed in the original; reconstructed here
test <- dat[-train, ]
predict_svm2 <- predict(svmfitRad, test)
mc_svm2 <- table(predict_svm2, test$y)
mc_svm2 %>% kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| 1 | 2 |
|---|---|
| 77 | 1 |
| 0 | 22 |

| Correct classification |
|---|
| 0.99 |

| Misclassification |
|---|
| 0.01 |
We have seen that we can fit an SVM with a non-linear kernel in order to perform classification using a non-linear decision boundary. We will now see that we can also obtain a non-linear decision boundary by performing logistic regression using non-linear transformations of the features.
Parts:
Generate a data set with n = 500 and p = 2, such that the observations belong to two classes with a quadratic decision boundary between them. For instance, you can do this as follows:
We generate data with n = 500 and p = 2 such that the observations belong to two classes with a quadratic decision boundary between them.
Plot the observations, colored according to their class labels. Your plot should display X1 on the x-axis, and X2 on the yaxis.
plot(x1[y == 0], x2[y == 0], col = "red", xlab = "X1", ylab = "X2", pch = 16, las = 1)
points(x1[y == 1], x2[y == 1], col = "blue", pch = 16)
Fit a logistic regression model to the data, using X1 and X2 as predictors.
We fit a logistic regression model with covariates x1 and x2.
Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be linear.
set.seed(131019)
x1 <- runif(500) - 0.5
x2 <- runif(500) - 0.5
y <- as.integer(x1 ^ 2 - x2 ^ 2 > 0)
datos <- data.frame(x1 = x1, x2 = x2, y = as.factor(y))
lr.fit <- glm(y ~ ., data = datos, family = 'binomial')
lr.prob <- predict(lr.fit, newdata = datos, type = 'response')
lr.prob[lr.prob > .5] <- 1
lr.prob[lr.prob < .5] <- 0
data <- data.frame(datos, lr.prob)
data$lr.prob <- as.factor(data$lr.prob)
plot(data$x1, data$x2, col = data$lr.prob, xlab = "X1", ylab = "X2", las = 1, pch = 16)
mc_glmsvm <- table(lr.prob, datos$y)
mc_glmsvm %>% kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| | 0 | 1 |
|---|---|---|
| 0 | 166 | 165 |
| 1 | 91 | 78 |

clasificacion_glmsvm1 <- (mc_glmsvm[1, 1] + mc_glmsvm[2, 2]) / sum(mc_glmsvm)
clasificacion_glmsvm <- (mc_glmsvm[1, 2] + mc_glmsvm[2, 1]) / sum(mc_glmsvm)
kable(clasificacion_glmsvm1, col.names = "Correct classification")
| Correct classification |
|---|
| 0.488 |

| Misclassification |
|---|
| 0.512 |
Fitted model
Now fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors (e.g. X1^2, X1×X2, log(X2), and so forth).
Now we fit a logistic regression model to the data using non-linear functions of X1 and X2 as predictors.
Apply this model to the training data in order to obtain a predicted class label for each training observation. Plot the observations, colored according to the predicted class labels. The decision boundary should be obviously non-linear. If it is not, then repeat (a)-(e) until you come up with an example in which the predicted class labels are obviously non-linear.
lr.prob2 <- predict(lr.fit2, newdata = dat2, type = 'response')
lr.prob2[lr.prob2 > .5] <- 1
lr.prob2[lr.prob2 < .5] <- 0
data2 <- data.frame(dat2, lr.prob2)
data2$lr.prob2 <- as.factor(data2$lr.prob2)
plot(data2$x1, data2$x2, col = data2$lr.prob2, xlab = "X1", ylab = "X2", las = 1, pch = 16)
mc_glmsvm2 <- table(lr.prob2, dat2$y)
mc_glmsvm2 %>% kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| | 0 | 1 |
|---|---|---|
| 0 | 261 | 0 |
| 1 | 0 | 239 |

clasificacion_glmsvm23 <- (mc_glmsvm2[1, 1] + mc_glmsvm2[2, 2]) / sum(mc_glmsvm2)
clasificacion_glmsvm2 <- (mc_glmsvm2[1, 2] + mc_glmsvm2[2, 1]) / sum(mc_glmsvm2)
kable(clasificacion_glmsvm23, col.names = "Correct classification")
| Correct classification |
|---|
| 1 |

| Misclassification |
|---|
| 0 |
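The perfect separation above is no accident: the true boundary x1^2 − x2^2 = 0 is linear in the quadratic features. A pure-Python sketch (hypothetical data generated the same way; plain stochastic gradient descent in place of glm) showing that logistic regression with x1^2 and x2^2 added achieves high training accuracy while the linear-feature model cannot:

```python
import math
import random

random.seed(0)
n = 500
x1 = [random.random() - 0.5 for _ in range(n)]
x2 = [random.random() - 0.5 for _ in range(n)]
y = [1 if a * a - b * b > 0 else 0 for a, b in zip(x1, x2)]

def fit_logistic(feats, y, lr=0.5, epochs=500):
    """Logistic regression by stochastic gradient descent; w[0] is the intercept."""
    w = [0.0] * (len(feats[0]) + 1)
    for _ in range(epochs):
        for f, yi in zip(feats, y):
            z = w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))
            p = 1 / (1 + math.exp(-max(-30.0, min(30.0, z))))
            g = p - yi                     # gradient of the log-loss wrt z
            w[0] -= lr * g
            for j, fj in enumerate(f):
                w[j + 1] -= lr * g * fj
    return w

def accuracy(w, feats, y):
    hits = 0
    for f, yi in zip(feats, y):
        z = w[0] + sum(wi * fi for wi, fi in zip(w[1:], f))
        hits += (z > 0) == (yi == 1)
    return hits / len(y)

linear = [[a, b] for a, b in zip(x1, x2)]
quadratic = [[a, b, a * a, b * b] for a, b in zip(x1, x2)]
acc_lin = accuracy(fit_logistic(linear, y), linear, y)
acc_quad = accuracy(fit_logistic(quadratic, y), quadratic, y)
```

Because the class is independent of the sign of each coordinate, no linear boundary in (x1, x2) can do much better than chance, while the quadratic-feature model separates the classes almost perfectly.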
Fit a support vector classifier to the data with X1 and X2 as predictors. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
Fit a SVM using a non-linear kernel to the data. Obtain a class prediction for each training observation. Plot the observations, colored according to the predicted class labels.
svm_prob <- predict(svm_fit, newdata = data_2, type = 'response')
svm_prob[svm_prob > .5] <- 1
svm_prob[svm_prob < .5] <- 0
datta_2 <- data.frame(data_2, svm_prob)
datta_2$svm_prob <- as.factor(datta_2$svm_prob)
plot(datta_2$x1, datta_2$x2, col = datta_2$svm_prob, xlab = "X1", ylab = "X2", las = 1, pch = 16)
Comment on your results.
mc_svml <- table(svm.prob, datos$y)
mc_svml %>% kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| | 0 | 1 |
|---|---|---|
| 0 | 139 | 150 |
| 1 | 118 | 93 |

clasificacion_svm1 <- (mc_svml[1, 1] + mc_svml[2, 2]) / length(datta$y)
clasificacion_svm <- (mc_svml[1, 2] + mc_svml[2, 1]) / length(datta$y)
kable(clasificacion_svm1, col.names = "Correct classification")
| Correct classification |
|---|
| 0.464 |

| Misclassification |
|---|
| 0.536 |

mc_svmr <- table(svm_prob, datos$y)
mc_svmr %>% kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover"))
| | 0 | 1 |
|---|---|---|
| 0 | 137 | 127 |
| 1 | 120 | 116 |

clasificacion_svmr1 <- (mc_svmr[1, 1] + mc_svmr[2, 2]) / length(datta_2$y)
clasificacion_svmr <- (mc_svmr[1, 2] + mc_svmr[2, 1]) / length(datta_2$y)
kable(clasificacion_svmr1, col.names = "Correct classification")
| Correct classification |
|---|
| 0.506 |

| Misclassification |
|---|
| 0.494 |
At the end of Section 9.6.1, it is claimed that in the case of data that is just barely linearly separable, a support vector classifier with a small value of cost that misclassifies a couple of training observations may perform better on test data than one with a huge value of cost that does not misclassify any training observations. You will now investigate this claim.
Parts:
Generate two-class data with p = 2 in such a way that the classes are just barely linearly separable.
We simulate two-class data with p = 2 such that the classes are just barely linearly separable.
Compute the cross-validation error rates for support vector classifiers with a range of cost values. How many training errors are misclassified for each value of cost considered, and how does this relate to the cross-validation errors obtained?
We compute the cross-validation error rates for support vector classifiers over a range of cost values.
set.seed(1)
tune.out <- tune(svm, y ~ ., data = daticos, kernel = "linear",
                 ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
Let us look at a summary of the model:
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 0.01
##
## - best performance: 0.232
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.377 0.03335000
## 2 1e-02 0.232 0.04467164
## 3 1e-01 0.239 0.04557046
## 4 1e+00 0.238 0.04467164
## 5 5e+00 0.238 0.04467164
## 6 1e+01 0.238 0.04467164
## 7 1e+02 0.238 0.04467164
##
## Call:
## best.tune(method = svm, train.x = y ~ ., data = daticos, ranges = list(cost = c(0.001,
## 0.01, 0.1, 1, 5, 10, 100)), kernel = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 639
##
## ( 319 320 )
##
##
## Number of Classes: 2
##
## Levels:
## 0 1
Generate an appropriate test data set, and compute the test errors corresponding to each of the values of cost considered. Which value of cost leads to the fewest test errors, and how does this compare to the values of cost that yield the fewest training errors and the fewest cross-validation errors?
## [1] 0.3533333 0.2200000 0.2333333 0.2333333 0.2333333 0.2333333 0.2333333
kable(paste('Cost', cost.grid[which.min(err.rate.test)], 'has the minimum test error:', min(err.rate.test)), col.names = "Result")
| Result |
|---|
| Cost 0.01 has the minimum test error: 0.22 |
Discuss your results.
In this problem, you will use support vector approaches in order to predict whether a given car gets high or low gas mileage based on the Auto data set.
Parts:
Create a binary variable that takes on a 1 for cars with gas mileage above the median, and a 0 for cars with gas mileage below the median.
In this problem we use support vector approaches to predict whether a given car gets high or low gas mileage based on the Auto data set.
Data description:
A data frame with 392 observations on the following 9 variables.
The raw data have 408 observations; 16 observations with NAs were removed.
We create a binary variable that takes the value one if mpg is above its median and zero if it is below.
| Median |
|---|
| 22.75 |
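The binary variable is just an indicator for mpg above its median; a minimal sketch with hypothetical values (the real exercise uses the median of Auto$mpg, 22.75, shown above):

```python
import statistics

mpg = [18.0, 15.0, 36.0, 22.75, 30.5, 14.0]  # hypothetical sample values
med = statistics.median(mpg)                  # 20.375 for this toy sample
mpg01 = [1 if m > med else 0 for m in mpg]    # 1 = above-median mileage
print(med, mpg01)
```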
Fit a support vector classifier to the data with various values of cost, in order to predict whether a car gets high or low gas mileage. Report the cross-validation errors associated with different values of this parameter. Comment on your results.
We compute the cross-validation error rates for support vector classifiers over the range of cost values.
tune.auto <- tune(svm, median_mpg ~ ., data = bd_auto, kernel = "linear",
                  ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100)))
Now let us look at the model summary:
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1
##
## - best performance: 0.0936813
##
## - Detailed performance results:
## cost error dispersion
## 1 1e-03 0.10857094 0.02384915
## 2 1e-02 0.10413206 0.03237924
## 3 1e-01 0.10170970 0.03407414
## 4 1e+00 0.09368130 0.02210164
## 5 5e+00 0.09789524 0.02593135
## 6 1e+01 0.10338815 0.02768520
## 7 1e+02 0.12046961 0.03412864
Now repeat (b), this time using SVMs with radial and polynomial basis kernels, with different values of gamma and degree and cost. Comment on your results.
We repeat part (b), this time using SVMs with radial and polynomial basis kernels, for different values of gamma, degree, and cost.
Model with radial kernel
# Note: median_mpg should be a factor here; as a numeric response, svm fits
# eps-regression rather than C-classification (see the best.tune output below).
tune.auto2 <- tune(svm, median_mpg ~ ., data = bd_auto, kernel = "radial",
                   ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100),
                                 gamma = c(0.5, 1, 2, 3, 4)))
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma
## 1 0.5
##
## - best performance: 0.06403344
##
## - Detailed performance results:
## cost gamma error dispersion
## 1 1e-03 0.5 0.47919628 0.090617665
## 2 1e-02 0.5 0.39225952 0.073074319
## 3 1e-01 0.5 0.08479546 0.018008475
## 4 1e+00 0.5 0.06403344 0.017487465
## 5 5e+00 0.5 0.06947274 0.023052725
## 6 1e+01 0.5 0.07196826 0.025563570
## 7 1e+02 0.5 0.07736469 0.031480222
## 8 1e-03 1.0 0.48701642 0.092222527
## 9 1e-02 1.0 0.46450268 0.087598554
## 10 1e-01 1.0 0.27853302 0.051749479
## 11 1e+00 1.0 0.09805619 0.017002176
## 12 5e+00 1.0 0.10315891 0.021590590
## 13 1e+01 1.0 0.10444029 0.023985435
## 14 1e+02 1.0 0.10444356 0.023981735
## 15 1e-03 2.0 0.48883144 0.092575461
## 16 1e-02 2.0 0.48210414 0.090932581
## 17 1e-01 2.0 0.41946497 0.076343025
## 18 1e+00 2.0 0.20360650 0.009336804
## 19 5e+00 2.0 0.20469786 0.011348028
## 20 1e+01 2.0 0.20469786 0.011348028
## 21 1e+02 2.0 0.20469786 0.011348028
## 22 1e-03 3.0 0.48900970 0.092611431
## 23 1e-02 3.0 0.48362255 0.091190782
## 24 1e-01 3.0 0.43294417 0.078198238
## 25 1e+00 3.0 0.23066226 0.007152203
## 26 5e+00 3.0 0.23087457 0.007553667
## 27 1e+01 3.0 0.23087457 0.007553667
## 28 1e+02 3.0 0.23087457 0.007553667
## 29 1e-03 4.0 0.48907195 0.092626479
## 30 1e-02 4.0 0.48405713 0.091236087
## 31 1e-01 4.0 0.43653305 0.078345767
## 32 1e+00 4.0 0.23550898 0.006188529
## 33 5e+00 4.0 0.23557570 0.006270171
## 34 1e+01 4.0 0.23557570 0.006270171
## 35 1e+02 4.0 0.23557570 0.006270171
##
## Call:
## best.tune(method = svm, train.x = median_mpg ~ ., data = bd_auto,
## ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100), gamma = c(0.5,
## 1, 2, 3, 4)), kernel = "radial")
##
##
## Parameters:
## SVM-Type: eps-regression
## SVM-Kernel: radial
## cost: 1
## gamma: 0.5
## epsilon: 0.1
##
##
## Number of Support Vectors: 281
Model with a polynomial kernel
tune.auto3 <- tune(svm, median_mpg ~ ., data = bd_auto, kernel = "polynomial", ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100), gamma = c(0.5, 1, 2, 3, 4), degree = c(2, 3, 4)))
summary(tune.auto3)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost gamma degree
## 0.001 2 3
##
## - best performance: 0.1118502
##
## - Detailed performance results:
## cost gamma degree error dispersion
## 1 1e-03 0.5 2 0.3037207 0.03913115
## 2 1e-02 0.5 2 0.2066717 0.02459985
## 3 1e-01 0.5 2 0.1346013 0.04045215
## 4 1e+00 0.5 2 0.1334015 0.04167428
## 5 5e+00 0.5 2 0.1576084 0.03699263
## 6 1e+01 0.5 2 0.1803726 0.03344241
## 7 1e+02 0.5 2 0.2529667 0.04313562
## 8 1e-03 1.0 2 0.2560198 0.03328955
## 9 1e-02 1.0 2 0.1588127 0.02100216
## 10 1e-01 1.0 2 0.1271540 0.04298727
## 11 1e+00 1.0 2 0.1519365 0.03696317
## 12 5e+00 1.0 2 0.2042271 0.03107685
## 13 1e+01 1.0 2 0.2198195 0.03336798
## 14 1e+02 1.0 2 0.2731809 0.05179575
## 15 1e-03 2.0 2 0.1894369 0.02107246
## 16 1e-02 2.0 2 0.1304596 0.04250981
## 17 1e-01 2.0 2 0.1370726 0.04035350
## 18 1e+00 2.0 2 0.1971576 0.02940547
## 19 5e+00 2.0 2 0.2403235 0.03783207
## 20 1e+01 2.0 2 0.2731809 0.05179575
## 21 1e+02 2.0 2 0.2731809 0.05179575
## 22 1e-03 3.0 2 0.1630548 0.02031295
## 23 1e-02 3.0 2 0.1270127 0.04297037
## 24 1e-01 3.0 2 0.1496705 0.03706794
## 25 1e+00 3.0 2 0.2182405 0.03330025
## 26 5e+00 3.0 2 0.2731812 0.05180174
## 27 1e+01 3.0 2 0.2731812 0.05180174
## 28 1e+02 3.0 2 0.2731812 0.05180174
## 29 1e-03 4.0 2 0.1420894 0.02886099
## 30 1e-02 4.0 2 0.1286486 0.04185520
## 31 1e-01 4.0 2 0.1651422 0.03773183
## 32 1e+00 4.0 2 0.2311673 0.03506680
## 33 5e+00 4.0 2 0.2731809 0.05179575
## 34 1e+01 4.0 2 0.2731809 0.05179575
## 35 1e+02 4.0 2 0.2731809 0.05179575
## 36 1e-03 0.5 3 0.1414858 0.02930834
## 37 1e-02 0.5 3 0.1170883 0.04161977
## 38 1e-01 0.5 3 0.1172960 0.06223539
## 39 1e+00 0.5 3 0.1547983 0.07114794
## 40 5e+00 0.5 3 0.1824597 0.06594286
## 41 1e+01 0.5 3 0.1920270 0.07872333
## 42 1e+02 0.5 3 0.2151585 0.07038655
## 43 1e-03 1.0 3 0.1199604 0.04083700
## 44 1e-02 1.0 3 0.1141842 0.05818515
## 45 1e-01 1.0 3 0.1504562 0.07225655
## 46 1e+00 1.0 3 0.1904213 0.07620357
## 47 5e+00 1.0 3 0.2151585 0.07038655
## 48 1e+01 1.0 3 0.2151585 0.07038655
## 49 1e+02 1.0 3 0.2151585 0.07038655
## 50 1e-03 2.0 3 0.1118502 0.05466670
## 51 1e-02 2.0 3 0.1444370 0.07164791
## 52 1e-01 2.0 3 0.1859255 0.06970397
## 53 1e+00 2.0 3 0.2151585 0.07038655
## 54 5e+00 2.0 3 0.2151585 0.07038655
## 55 1e+01 2.0 3 0.2151585 0.07038655
## 56 1e+02 2.0 3 0.2151585 0.07038655
## 57 1e-03 3.0 3 0.1268594 0.06999682
## 58 1e-02 3.0 3 0.1656884 0.06722630
## 59 1e-01 3.0 3 0.2011184 0.07454636
## 60 1e+00 3.0 3 0.2151609 0.07038467
## 61 5e+00 3.0 3 0.2151609 0.07038467
## 62 1e+01 3.0 3 0.2151609 0.07038467
## 63 1e+02 3.0 3 0.2151609 0.07038467
## 64 1e-03 4.0 3 0.1380603 0.07077995
## 65 1e-02 4.0 3 0.1827353 0.06619124
## 66 1e-01 4.0 3 0.2151585 0.07038655
## 67 1e+00 4.0 3 0.2151585 0.07038655
## 68 5e+00 4.0 3 0.2151585 0.07038655
## 69 1e+01 4.0 3 0.2151585 0.07038655
## 70 1e+02 4.0 3 0.2151585 0.07038655
## 71 1e-03 0.5 4 0.1723381 0.02687206
## 72 1e-02 0.5 4 0.1745745 0.09661690
## 73 1e-01 0.5 4 0.2322382 0.14263048
## 74 1e+00 0.5 4 0.3238749 0.11425828
## 75 5e+00 0.5 4 0.3743967 0.13332628
## 76 1e+01 0.5 4 0.3841191 0.13537137
## 77 1e+02 0.5 4 0.3948103 0.13856391
## 78 1e-03 1.0 4 0.1984546 0.13803729
## 79 1e-02 1.0 4 0.2426577 0.12739937
## 80 1e-01 1.0 4 0.3402189 0.11322789
## 81 1e+00 1.0 4 0.3948103 0.13856391
## 82 5e+00 1.0 4 0.3948103 0.13856391
## 83 1e+01 1.0 4 0.3948103 0.13856391
## 84 1e+02 1.0 4 0.3948103 0.13856391
## 85 1e-03 2.0 4 0.2643480 0.11860975
## 86 1e-02 2.0 4 0.3503557 0.10962565
## 87 1e-01 2.0 4 0.3948103 0.13856391
## 88 1e+00 2.0 4 0.3948103 0.13856391
## 89 5e+00 2.0 4 0.3948103 0.13856391
## 90 1e+01 2.0 4 0.3948103 0.13856391
## 91 1e+02 2.0 4 0.3948103 0.13856391
## 92 1e-03 3.0 4 0.3350223 0.11508852
## 93 1e-02 3.0 4 0.3937248 0.13789661
## 94 1e-01 3.0 4 0.3948081 0.13856314
## 95 1e+00 3.0 4 0.3948081 0.13856314
## 96 5e+00 3.0 4 0.3948081 0.13856314
## 97 1e+01 3.0 4 0.3948081 0.13856314
## 98 1e+02 3.0 4 0.3948081 0.13856314
## 99 1e-03 4.0 4 0.3659839 0.12252747
## 100 1e-02 4.0 4 0.3948103 0.13856391
## 101 1e-01 4.0 4 0.3948103 0.13856391
## 102 1e+00 4.0 4 0.3948103 0.13856391
## 103 5e+00 4.0 4 0.3948103 0.13856391
## 104 1e+01 4.0 4 0.3948103 0.13856391
## 105 1e+02 4.0 4 0.3948103 0.13856391
##
## Call:
## best.tune(method = svm, train.x = median_mpg ~ ., data = bd_auto,
## ranges = list(cost = c(0.001, 0.01, 0.1, 1, 5, 10, 100), gamma = c(0.5,
## 1, 2, 3, 4), degree = c(2, 3, 4)), kernel = "polynomial")
##
##
## Parameters:
## SVM-Type: eps-regression
## SVM-Kernel: polynomial
## cost: 0.001
## degree: 3
## gamma: 2
## coef.0: 0
## epsilon: 0.1
##
##
## Number of Support Vectors: 339
Make some plots to back up your assertions in (b) and (c). Hint: In the lab, we used the plot() function for svm objects only in cases with p = 2. When p > 2, you can use the plot() function to create plots displaying pairs of variables at a time. Essentially, instead of typing
To support the results obtained in the previous parts, we plot the models fitted with each kernel at its optimal parameters. In each plot the y-axis is the mpg variable and the x-axis is one of the covariates, so that we can see the relationship between each covariate and the response.
bd_auto <- data.frame(Auto,median_mpg)
bd_auto$median_mpg <- as.factor(bd_auto$median_mpg)
svm.linear <- svm(median_mpg ~ ., data = bd_auto, kernel = "linear", cost = 0.1)
svm.poly <- svm(median_mpg ~ ., data = bd_auto, kernel = "polynomial", cost = 0.001, degree = 3, gamma=2)
svm.radial <- svm(median_mpg ~ ., data = bd_auto, kernel = "radial", cost = 1, gamma = 0.5)
plotpairs <- function(fit) {
  for (name in names(bd_auto)[!(names(bd_auto) %in% c("mpg", "median_mpg", "name"))]) {
    plot(fit, bd_auto, as.formula(paste("mpg~", name, sep = "")))
  }
}
plotpairs(svm.linear)
This problem involves the OJ data set which is part of the ISLR package.
Parts:
Create a training set containing a random sample of 800 observations, and a test set containing the remaining observations.
We create a training set containing a random sample of 800 observations and a test set containing the remaining observations.
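The splitting code itself is not shown in the document; a minimal sketch, assuming the ISLR package is available and an arbitrary seed (the seed actually used is not reported):

```r
library(ISLR)                  # provides the OJ data set (1070 rows)
set.seed(1)                    # assumed seed; the original is not reported
train <- sample(nrow(OJ), 800) # random sample of 800 row indices
OJ.train <- OJ[train, ]        # training set: 800 observations
OJ.test  <- OJ[-train, ]       # test set: the remaining 270 observations
```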
Fit a support vector classifier to the training data using cost=0.01, with Purchase as the response and the other variables as predictors. Use the summary() function to produce summary statistics, and describe the results obtained.
SVM fitted to the training data using cost = 0.01. The fitting call is not shown in the document; reconstructed here from the Call in the output below:
svm.oj <- svm(Purchase ~ ., data = OJ.train, kernel = "linear", cost = 0.01)
summary(svm.oj)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "linear", cost = 0.01)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 0.01
##
## Number of Support Vectors: 435
##
## ( 219 216 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
What are the training and test error rates?
We will use the confusion matrices to answer this question, as follows:
train.pred <- predict(svm.oj, OJ.train)
svm_oj <- table(OJ.train$Purchase, train.pred)
svm_oj %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 420 | 65 |
| MM | 75 | 240 |
clasificacion_svmbuena <- (svm_oj[1,1]+svm_oj[2,2])/length(OJ.train$Purchase)
clasificacion_svmala <- (svm_oj[1,2]+svm_oj[2,1])/length(OJ.train$Purchase)
kable(clasificacion_svmbuena, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.825 |
| Clasificación mala |
|---|
| 0.175 |
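Since this good/bad classification arithmetic is repeated for every model below, a small helper (not part of the original code) can compute both rates from any confusion matrix whose rows are the true classes:

```r
# Helper added for convenience; not in the original analysis.
# Given a square confusion matrix, return the proportion of correct
# classifications (the diagonal) and of misclassifications (the rest).
class_rates <- function(tab) {
  good <- sum(diag(tab)) / sum(tab)
  c("Clasificacion buena" = good, "Clasificacion mala" = 1 - good)
}
class_rates(svm_oj)  # ~0.825 good, 0.175 bad on the training data
```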
test.pred <- predict(svm.oj, OJ.test)
svm_ojt <- table(OJ.test$Purchase, test.pred)
svm_ojt %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
clasificacion_svmbuena <- (svm_ojt[1,1]+svm_ojt[2,2])/length(OJ.test$Purchase)
clasificacion_svmala <- (svm_ojt[1,2]+svm_ojt[2,1])/length(OJ.test$Purchase)
kable(clasificacion_svmbuena, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.8222222 |
| Clasificación mala |
|---|
| 0.1777778 |
Use the tune() function to select an optimal cost. Consider values in the range 0.01 to 10.
Selecting an optimal cost with tune().
set.seed(2)
tune.out <- tune(svm, Purchase ~ ., data = OJ.train, kernel = "linear", ranges = list(cost = seq(0.01, 10, by = 0.25)))
summary(tune.out)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 4.01
##
## - best performance: 0.165
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.17625 0.04059026
## 2 0.26 0.16750 0.04257347
## 3 0.51 0.17125 0.04126894
## 4 0.76 0.16875 0.04259385
## 5 1.01 0.17000 0.04090979
## 6 1.26 0.16750 0.03782269
## 7 1.51 0.16750 0.03782269
## 8 1.76 0.16750 0.03782269
## 9 2.01 0.16750 0.03782269
## 10 2.26 0.16750 0.03782269
## 11 2.51 0.16750 0.03827895
## 12 2.76 0.16875 0.03738408
## 13 3.01 0.16875 0.03738408
## 14 3.26 0.16750 0.03782269
## 15 3.51 0.16625 0.03634805
## 16 3.76 0.16625 0.03634805
## 17 4.01 0.16500 0.03809710
## 18 4.26 0.16500 0.03809710
## 19 4.51 0.16500 0.03809710
## 20 4.76 0.16625 0.03634805
## 21 5.01 0.16750 0.03545341
## 22 5.26 0.16750 0.03545341
## 23 5.51 0.16750 0.03545341
## 24 5.76 0.16750 0.03545341
## 25 6.01 0.16750 0.03545341
## 26 6.26 0.16750 0.03545341
## 27 6.51 0.16750 0.03545341
## 28 6.76 0.16625 0.03729108
## 29 7.01 0.16750 0.03545341
## 30 7.26 0.16750 0.03545341
## 31 7.51 0.16750 0.03545341
## 32 7.76 0.16750 0.03545341
## 33 8.01 0.16750 0.03545341
## 34 8.26 0.16750 0.03545341
## 35 8.51 0.17000 0.03736085
## 36 8.76 0.17000 0.03736085
## 37 9.01 0.17000 0.03736085
## 38 9.26 0.17000 0.03736085
## 39 9.51 0.17000 0.03736085
## 40 9.76 0.17000 0.03736085
We can see that the error is smallest when cost = 4.01; with that in mind, let's obtain the best model.
##
## Call:
## best.tune(method = svm, train.x = Purchase ~ ., data = OJ.train,
## ranges = list(cost = seq(0.01, 10, by = 0.25)), kernel = "linear")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: linear
## cost: 4.01
##
## Number of Support Vectors: 329
Compute the training and test error rates using this new value for cost.
Fitting the SVM with the optimal cost. The call defining svm.li is not shown; presumably, following the pattern used later for the other kernels:
svm.li <- svm(Purchase ~ ., data = OJ.train, kernel = "linear", cost = tune.out$best.parameter$cost)
train.pred <- predict(svm.li, OJ.train)
svm1 <- table(OJ.train$Purchase, train.pred)
svm1 %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 423 | 62 |
| MM | 70 | 245 |
clasificacion_svmbue <- (svm1[1,1]+svm1[2,2])/length(OJ.train$Purchase)
clasificacion_svmal <- (svm1[1,2]+svm1[2,1])/length(OJ.train$Purchase)
kable(clasificacion_svmbue, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.835 |
| Clasificación mala |
|---|
| 0.165 |
test.pred <- predict(svm.li, OJ.test)
svm_test <-table(OJ.test$Purchase, test.pred)
svm_test %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 155 | 13 |
| MM | 29 | 73 |
clasificacion_svmbu <- (svm_test[1,1]+svm_test[2,2])/length(OJ.test$Purchase)
clasificacion_svma <- (svm_test[1,2]+svm_test[2,1])/length(OJ.test$Purchase)
kable(clasificacion_svmbu, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.8444444 |
| Clasificación mala |
|---|
| 0.1555556 |
Repeat parts (b) through (e) using a support vector machine with a radial kernel. Use the default value for gamma.
We fit an SVM with a radial kernel and the default gamma. The call is not shown in the document; reconstructed here from the Call in the output below:
svm.radial <- svm(Purchase ~ ., data = OJ.train, kernel = "radial")
summary(svm.radial)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "radial")
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1
##
## Number of Support Vectors: 373
##
## ( 188 185 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train.predr <- predict(svm.radial, OJ.train)
svm_rad <- table(OJ.train$Purchase, train.predr)
svm_rad %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 441 | 44 |
| MM | 77 | 238 |
Classification rates
clasificacion_svmbue <- (svm_rad[1,1]+svm_rad[2,2])/length(OJ.train$Purchase)
clasificacion_svmal <- (svm_rad[1,2]+svm_rad[2,1])/length(OJ.train$Purchase)
kable(clasificacion_svmbue, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.84875 |
| Clasificación mala |
|---|
| 0.15125 |
Confusion matrix on the test data
test.predr <- predict(svm.radial, OJ.test)
svm_radt <-table(OJ.test$Purchase, test.predr)
svm_radt %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
Classification rates
clasificacion_svmbue <- (svm_radt[1,1]+svm_radt[2,2])/length(OJ.test$Purchase)
clasificacion_svmal <- (svm_radt[1,2]+svm_radt[2,1])/length(OJ.test$Purchase)
kable(clasificacion_svmbue, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.8148148 |
| Clasificación mala |
|---|
| 0.1851852 |
set.seed(2)
tune.outra <- tune(svm, Purchase ~ ., data = OJ.train, kernel = "radial", ranges = list(cost = seq(0.01, 10, by = 0.25)))
summary(tune.outra)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 1.26
##
## - best performance: 0.17
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.39375 0.03240906
## 2 0.26 0.17375 0.03972562
## 3 0.51 0.17500 0.03908680
## 4 0.76 0.17250 0.03855011
## 5 1.01 0.17125 0.03283481
## 6 1.26 0.17000 0.03446012
## 7 1.51 0.17500 0.03632416
## 8 1.76 0.17750 0.03717451
## 9 2.01 0.17750 0.03525699
## 10 2.26 0.17875 0.03283481
## 11 2.51 0.17750 0.03622844
## 12 2.76 0.17875 0.03387579
## 13 3.01 0.18250 0.03343734
## 14 3.26 0.18375 0.03438447
## 15 3.51 0.18375 0.03438447
## 16 3.76 0.18375 0.03682259
## 17 4.01 0.18375 0.03682259
## 18 4.26 0.18500 0.03899786
## 19 4.51 0.18375 0.03682259
## 20 4.76 0.18375 0.03682259
## 21 5.01 0.18375 0.03682259
## 22 5.26 0.18375 0.03682259
## 23 5.51 0.18500 0.03717451
## 24 5.76 0.18500 0.03717451
## 25 6.01 0.18500 0.03717451
## 26 6.26 0.18500 0.03717451
## 27 6.51 0.18500 0.03717451
## 28 6.76 0.18625 0.03747684
## 29 7.01 0.18750 0.03535534
## 30 7.26 0.18750 0.03535534
## 31 7.51 0.18750 0.03535534
## 32 7.76 0.18750 0.03535534
## 33 8.01 0.18750 0.03535534
## 34 8.26 0.18625 0.03356689
## 35 8.51 0.18750 0.03118048
## 36 8.76 0.18625 0.03087272
## 37 9.01 0.18500 0.03216710
## 38 9.26 0.18500 0.03050501
## 39 9.51 0.18625 0.03030516
## 40 9.76 0.18750 0.03173239
Fitting the SVM with the optimal cost
svm.radialc <- svm(Purchase ~ ., kernel = "radial", data = OJ.train, cost = tune.outra$best.parameter$cost)
summary(svm.radialc)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "radial", cost = tune.outra$best.parameter$cost)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: radial
## cost: 1.26
##
## Number of Support Vectors: 367
##
## ( 186 181 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train.pred <- predict(svm.radialc, OJ.train)
mta<- table(OJ.train$Purchase, train.pred)
mta %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 441 | 44 |
| MM | 74 | 241 |
Classification rates
clasificacion_svmbuenc <- (mta[1,1]+mta[2,2])/length(OJ.train$Purchase)
clasificacion_svmalac <- (mta[1,2]+mta[2,1])/length(OJ.train$Purchase)
kable(clasificacion_svmbuenc, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.8525 |
| Clasificación mala |
|---|
| 0.1475 |
Confusion matrix on the test data, radial model
test.pred <- predict(svm.radialc, OJ.test)
mop <- table(OJ.test$Purchase, test.pred)
mop %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 152 | 16 |
| MM | 32 | 70 |
Classification rates
clasificacion_svmbueop <- (mop[1,1]+mop[2,2])/length(OJ.test$Purchase)
clasificacion_svmalop <- (mop[1,2]+mop[2,1])/length(OJ.test$Purchase)
kable(clasificacion_svmbueop, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.8222222 |
| Clasificación mala |
|---|
| 0.1777778 |
Repeat parts (b) through (e) using a support vector machine with a polynomial kernel. Set degree=2.
SVM fitted with a polynomial kernel of degree 2. The call is not shown in the document; reconstructed here from the Call in the output below:
svm.poly <- svm(Purchase ~ ., data = OJ.train, kernel = "polynomial", degree = 2)
summary(svm.poly)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "polynomial",
## degree = 2)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 1
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 447
##
## ( 225 222 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train.predrpo <- predict(svm.poly, OJ.train)
svm_poly <- table(OJ.train$Purchase, train.predrpo)
svm_poly %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 449 | 36 |
| MM | 110 | 205 |
Classification rates
clasificacion_svmbuepo <- (svm_poly[1,1]+svm_poly[2,2])/length(OJ.train$Purchase)
clasificacion_svmalpo <- (svm_poly[1,2]+svm_poly[2,1])/length(OJ.train$Purchase)
kable(clasificacion_svmbuepo, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.8175 |
| Clasificación mala |
|---|
| 0.1825 |
Confusion matrix on the test data
test.predpol <- predict(svm.poly, OJ.test)
svm_pol <-table(OJ.test$Purchase, test.predpol)
svm_pol %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 153 | 15 |
| MM | 45 | 57 |
Classification rates
clasificacion_svmbuep <- (svm_pol[1,1]+svm_pol[2,2])/length(OJ.test$Purchase)
clasificacion_svmalp <- (svm_pol[1,2]+svm_pol[2,1])/length(OJ.test$Purchase)
kable(clasificacion_svmbuep, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.7777778 |
| Clasificación mala |
|---|
| 0.2222222 |
set.seed(2)
tune.outrapo <- tune(svm, Purchase ~ ., data = OJ.train, kernel = "polynomial",degree=2, ranges = list(cost = seq(0.01, 10, by = 0.25)))
summary(tune.outrapo)
##
## Parameter tuning of 'svm':
##
## - sampling method: 10-fold cross validation
##
## - best parameters:
## cost
## 2.76
##
## - best performance: 0.17625
##
## - Detailed performance results:
## cost error dispersion
## 1 0.01 0.39000 0.03670453
## 2 0.26 0.20625 0.04050463
## 3 0.51 0.20250 0.03670453
## 4 0.76 0.20000 0.04124790
## 5 1.01 0.19750 0.04116363
## 6 1.26 0.19875 0.04143687
## 7 1.51 0.19250 0.03736085
## 8 1.76 0.19250 0.03545341
## 9 2.01 0.18500 0.03809710
## 10 2.26 0.18250 0.03872983
## 11 2.51 0.17875 0.04566256
## 12 2.76 0.17625 0.04226652
## 13 3.01 0.17625 0.04505013
## 14 3.26 0.18000 0.04005205
## 15 3.51 0.17875 0.03955042
## 16 3.76 0.17875 0.03998698
## 17 4.01 0.18000 0.03961621
## 18 4.26 0.18250 0.04005205
## 19 4.51 0.18125 0.04007372
## 20 4.76 0.17875 0.03634805
## 21 5.01 0.18000 0.03872983
## 22 5.26 0.18125 0.04050463
## 23 5.51 0.18000 0.04133199
## 24 5.76 0.18000 0.04133199
## 25 6.01 0.18000 0.03872983
## 26 6.26 0.17750 0.03717451
## 27 6.51 0.17875 0.03682259
## 28 6.76 0.17875 0.03682259
## 29 7.01 0.17875 0.03682259
## 30 7.26 0.17875 0.03682259
## 31 7.51 0.18000 0.03496029
## 32 7.76 0.18000 0.03343734
## 33 8.01 0.18000 0.03343734
## 34 8.26 0.17875 0.03537988
## 35 8.51 0.17875 0.03537988
## 36 8.76 0.18000 0.03827895
## 37 9.01 0.18125 0.03875224
## 38 9.26 0.18000 0.03827895
## 39 9.51 0.18000 0.03827895
## 40 9.76 0.18125 0.03830162
Fitting the SVM with the optimal cost
svm.polyc <- svm(Purchase ~ ., kernel = "polynomial",degree=2, data = OJ.train, cost = tune.outrapo$best.parameter$cost)
summary(svm.polyc)
##
## Call:
## svm(formula = Purchase ~ ., data = OJ.train, kernel = "polynomial",
## degree = 2, cost = tune.outrapo$best.parameter$cost)
##
##
## Parameters:
## SVM-Type: C-classification
## SVM-Kernel: polynomial
## cost: 2.76
## degree: 2
## coef.0: 0
##
## Number of Support Vectors: 387
##
## ( 197 190 )
##
##
## Number of Classes: 2
##
## Levels:
## CH MM
train.predpo <- predict(svm.polyc, OJ.train)
poli<- table(OJ.train$Purchase, train.predpo)
poli %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 452 | 33 |
| MM | 92 | 223 |
Good and bad classification rates
clasificacion_svmbuenp <- (poli[1,1]+poli[2,2])/length(OJ.train$Purchase)
clasificacion_svmalap <- (poli[1,2]+poli[2,1])/length(OJ.train$Purchase)
kable(clasificacion_svmbuenp, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.84375 |
| Clasificación mala |
|---|
| 0.15625 |
Confusion matrix on the test data, polynomial model
test.predpo <- predict(svm.polyc, OJ.test)
polyc <- table(OJ.test$Purchase, test.predpo)
polyc %>% kable() %>%
kable_styling(bootstrap_options = c("striped", "hover"))
| | CH | MM |
|---|---|---|
| CH | 153 | 15 |
| MM | 40 | 62 |
Good and bad classification rates
clasificacion_svmbueopt <- (polyc[1,1]+polyc[2,2])/length(OJ.test$Purchase)
clasificacion_svmalopt <- (polyc[1,2]+polyc[2,1])/length(OJ.test$Purchase)
kable(clasificacion_svmbueopt, col.names="Clasificacion buena")
| Clasificacion buena |
|---|
| 0.7962963 |
| Clasificación mala |
|---|
| 0.2037037 |
Overall, which approach seems to give the best results on this data?
Comparing the test error rates obtained above (linear kernel, cost 4.01: 0.1556; radial kernel, cost 1.26: 0.1778; polynomial kernel of degree 2, cost 2.76: 0.2037), the support vector classifier with a linear kernel gives the best results on this data.
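The comparison can be read off the test confusion matrices reported above; the counts below are copied from those tables (270 test observations):

```r
# Test misclassification rates for each kernel, from the confusion
# matrices above (off-diagonal counts over 270 test observations).
test_err <- c(
  linear = (13 + 29) / 270,  # linear kernel, cost = 4.01
  radial = (16 + 32) / 270,  # radial kernel, cost = 1.26
  poly   = (15 + 40) / 270   # polynomial kernel, degree = 2, cost = 2.76
)
round(test_err, 4)  # linear 0.1556, radial 0.1778, poly 0.2037
```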
The cardiovascular and respiratory health of the population, in both the long and the short term, depends largely on air quality. Air pollution is rising worldwide, to the point that it is now one of the most severe environmental problems globally: it is present in every society, regardless of socioeconomic development, and it has a strong impact on human health. Economic growth and urbanization, together with activities such as the oil industry, services, agroindustry, and the growing number of motor vehicles, produce high volumes of pollutants.
The Aburrá Valley metropolitan area, and the city of Medellín in particular, is no stranger to this problem. For several years it has implemented strategies to reduce it, focused mainly on the number of cars in the city. To that end it has adopted the National Air Quality Strategy (Estrategia Nacional de Calidad del Aire), which prioritizes actions to reduce pollutant emissions from motor vehicles and from productive and service activities, making this part of the Sustainable Development Goals (SDGs), since it is an environmental problem that will afflict the population for at least the next 15 years.
On its open data portal, the Medellín mayor's office publishes pollutant measurements from the air quality monitoring network, broken down by pollutant, along with a data set called Encuesta de Calidad de Vida – Medellín Como Vamos, a survey of 342 questions covering 15 social dimensions, among them environmental quality and how the residents of each neighborhood perceive it. With these data, statistical learning can help us better understand air quality phenomena within the city. For example, Principal Component Analysis could build an intra-urban indicator that quantifies air severity by combining citizens' perception with the monitoring data; the number of vehicles per family in each neighborhood could also be estimated accurately, suggesting strategies beyond the environmental "pico y placa" driving restriction to reduce private car use.
Even so, the outlook for air quality in the coming years is discouraging, given the growth of the vehicle fleet and of commerce. It is therefore also important to design other kinds of surveys and data collection, in order to identify more reliable risk factors with which to attack the problem at its root and deliver a larger, more lasting solution to this public health issue, as well as perception surveys on the use of public transport and bicycles, so as to implement strategies that consolidate these modes over the private car.
In conclusion, the main consequence of poor air quality is harm to health, one of its main causes is the ever-growing number of vehicles, and an effective remedy is to identify the relationships behind poor air quality from the data collected day by day.
OMS (2005). Guías para la Calidad del Aire: Resumen de Evaluación de los Riesgos.
Clean Air Institute (2016). Plan integral de gestión de la calidad del aire para el área metropolitana del Valle de Aburrá. Available at: https://www.metropol.gov.co/ambiental/calidad-del-aire/Documents/PIGECA/PIGECA-Aprobado-Dic-2017.pdf
Romero, M.; Olite, D.; Álvarez, M. (2006). La contaminación del aire: su repercusión como problema de salud. Revista Cubana de Higiene y Epidemiología.
OMS (2018). Calidad del aire y salud: datos y cifras. Available at: https://www.who.int/es/news-room/fact-sheets/detail/ambient-(outdoor)-air-quality-and-health
Medellín Como Vamos (2019). Índice de Progreso Social: Comunas y Corregimientos de Medellín 2013-2017. Reporte metodológico.